ABSTRACTS
Inter-operator Test for the Clicking of Polylines in Earprints
The Netherlands Forensic Institute is a participant in the EU project on Forensic Ear Identification, FearID. For this project, sets of earprints have been collected from 1500 donors from several countries. Furthermore, software has been developed that scans and stores earprints, and processes them in different ways. A user may, for example, manually add a polyline to an image, following the imprint of the ear, from which a connected structure is determined that covers and represents the imprint, the 'superstructure'. We performed a validation experiment to clarify stability aspects of the clicking of these polylines. We find a small inter-operator effect, meaning that a single operator yields slightly more consistent polylines than different operators do. From the viewpoint of the resulting superstructures, using the width of the superstructures, weighted with respect to local intensities, there is no noticeable inter-operator effect.

It Was One of My Brothers
When DNA evidence is used to implicate a suspect, it may be of interest to know whether it is likely that the suspect's near relatives also share the suspect's DNA profile. We discuss methods for evaluating the probability that at least one of a set of the suspect's full- or half-siblings shares the suspect's DNA profile. Three such methods will be presented: exact calculation, estimation via Monte Carlo simulations, and estimation by means of sandwiching the probability between an upper and a lower bound. We show that, under many circumstances, this upper bound itself provides an extremely quick and accurate estimate of the probability that at least one of the relatives matches the suspect's profile. This work differs from the groundbreaking paper of Evett in 1992 by taking into account the dependencies among the brothers' profiles and in allowing for both full- and half-brothers.
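The sandwiching idea for the sibling-match probability can be illustrated with a minimal sketch. The abstract does not say which bounds the authors use, so the code below assumes the elementary ones: the largest single-sibling match probability as a lower bound and the sum of the single-sibling match probabilities (Boole's inequality) as an upper bound. It also assumes Hardy-Weinberg proportions and independent loci. The function names and any allele frequencies passed in are hypothetical.

```python
from math import prod

# Identity-by-descent coefficients (k0, k1, k2): the probabilities that a
# relative shares 0, 1 or 2 alleles identical by descent with the suspect.
IBD = {"full": (0.25, 0.5, 0.25), "half": (0.5, 0.5, 0.0)}


def locus_match_prob(genotype, freqs, relation):
    """P(relative has the suspect's genotype at one locus | suspect's genotype)."""
    a, b = genotype
    k0, k1, k2 = IBD[relation]
    pa, pb = freqs[a], freqs[b]
    if a == b:
        no_ibd = pa * pa          # both alleles drawn from the population
        one_ibd = pa              # one allele shared, the other must also be a
    else:
        no_ibd = 2 * pa * pb
        one_ibd = (pa + pb) / 2   # shared allele is a or b, the other is drawn
    return k0 * no_ibd + k1 * one_ibd + k2


def profile_match_prob(profile, freqs_by_locus, relation):
    """Multiply per-locus match probabilities (assumes independent loci)."""
    return prod(
        locus_match_prob(genotype, freqs_by_locus[locus], relation)
        for locus, genotype in profile.items()
    )


def bounds_any_match(match_probs):
    """Elementary bounds on P(at least one relative matches).

    These hold whatever the dependence among the relatives' profiles:
    max(p_i) <= P(union) <= min(1, sum(p_i)).
    """
    return max(match_probs), min(1.0, sum(match_probs))
```

With one full brother and several half-siblings, `bounds_any_match` would be called on the list of their individual `profile_match_prob` values; when every individual probability is small, the two bounds lie close together.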
The work was prompted by a case involving a large number of half-sisters, not all of whom were equally related to each other.

Handling Manipulated Evidence
Bayesian Networks (BNs) have been advocated for describing the relations of dependence among random variables and relevant hypotheses in a crime case. Moreover, they have been applied to help the investigator structure the problem and weigh the observed evidence, typically with respect to the hypothesis of guilt of a suspect. Since the investigation is performed in a more formal way than is usual in practice, a possible subtle drawback of BN-based investigations is overconfidence in the results. The most dangerous possibility occurs if manipulated evidence is introduced. Examples include false testimony and blood traces left to mislead the police. The objective of this work is to build a model that can help the investigator handle possibly manipulated evidence. Starting from the original BN structure provided by the expert, a new structure is built by replacing the original Conditional Probability Tables (CPTs) of the random variables whose instantiations are supposed to be manipulated with the intervention model often used in causal inference. We motivate our choice and show the conditions under which it is possible to update the probability that the observed evidence is genuine or manipulated, as well as the posterior distribution of the relevant hypotheses. Finally, comparing all the models derived from different configurations of more than one piece of (possibly) manipulated evidence, we show how to detect a criminal plan aimed at misleading the investigation.

A Bayesian Model to Control for Selection Bias, with an Application to Racial Profiling
This paper introduces a Bayesian model to control for selection bias.
The primary innovation is that the model does not require individual-level data for the underlying population; thus, it has broader application than the standard Heckman-style selection models. Instead, the model relies upon population-level data to identify an appropriate prior. The paper provides an application in the racial profiling context, where data on stops and searches of individual motorists on the highway exist, but there are no individual-level data on the underlying population of motorists on the highway.

Data Fusion, Data Mining and Pattern Recognition Applied to Fiber Analysis
Fibers are a ubiquitous form of transfer evidence amenable to multiple forensic analyses. These techniques include light and polarized-light microscopy, UV/VIS microspectrophotometry, IR microspectrophotometry, and Raman microspectrophotometry. Additional destructive analyses that may be used include chromatographic and instrumental analysis of dyes and polymers, melting points, and solubility. Thus, even a single fiber can generate an enormous amount of analytical data, but more is not necessarily better. Certainly, combining tests increases the discrimination power of the analytical scheme; however, time, cost, and evidentiary value also must be considered. Furthermore, issues such as inter- and intra-sample variation, sampling effects, and quality control have yet to be systematically addressed in this context. This is a chemometric challenge that relates to data fusion and pattern recognition, but findings can be of practical interest to investigators and forensic scientists. For example, an analyst might like to know what combination of analyses provides the most valuable discriminatory information. Is it one test or a well-designed and validated combination of tests? Conversely, what tests are of limited value? What is the cost-benefit tradeoff inherent in destructive versus non-destructive testing?
If a laboratory is to invest in new equipment, what makes the most sense in light of analytical power? How can the results of such modeling be quantitatively, accurately and fairly presented to a jury? This presentation will discuss the results to date of a comprehensive fabric and fiber analysis project utilizing the analytical methods mentioned above. The results of optical, spectral, and destructive analyses have been integrated into a comprehensive hierarchical database and evaluated using multivariate statistics, pattern recognition, and data mining techniques. Issues of statistical significance, feature selection, variable weighting, probabilities, uncertainties, and analytical power will be addressed. Jury presentation of findings will also be discussed, as will the utility of commonly available commercial software for these modeling applications.

Multivariate Characterization of XTC
The Netherlands is one of the largest XTC producing countries in the world. For this reason the project “samenspannen tegen XTC” (fighting together against XTC) started in 2002. This project focuses on intensifying the fight against the production and trade of synthetic drugs. One of the goals within this project is classifying impounded XTC tablets and linking them to producing laboratories. Direct linking, however, is difficult, because tablets are only very seldom found in laboratories. (This is in contrast to the powders from which the tablets are made, but these powders are hard to link to impounded tablets.) In this paper classical and Bayesian statistical techniques for the classification of both tablets and powders are discussed. This provides insight into why certain tablets (and powders) are grouped together. In particular, we consider whether the grouping is based on the production method, ground material, filling material, or other factors. In future research, we hope to use this information to find relations between XTC groups and certain producers.
Problems encountered in the current research, and some solutions, will be presented.

Represented by Proxy: Finding Relatives Using Offender DNA Databases
If a man commits an offense that results in cataloguing his DNA in an offender database, then inevitably his identical twin who has committed no offense is also catalogued. In a recent UK case a relative of a listed offender was tracked down thanks to a rare familial allele. These are odd cases, but with more sophisticated genetic analysis algorithms that could easily be added to the database searching, relatives could be tracked down routinely. This sort of occurrence raises two kinds of questions: (1) How far and wide can it go? (2) What are the social and legal ramifications? To the first point we present simulations, data, and arguments that suggest the effective net can be quite broad. To the second, we hope to stimulate discussion.

The Probative Value of Bullet Lead Evidence
Forensic evidence is often introduced in trials to help with decisions of guilt or innocence. The question that forensic experts attempt to answer is whether two items, one found at the crime scene and one found on a suspect, have a common origin. These types of questions have been largely addressed when trace evidence is biological in nature; a good example is DNA analysis, now commonplace in the courtroom. Determining the probative value of other types of evidence, including bullet lead composition, is more challenging. While establishing a match between two samples might not be problematic, deciding on the significance of the match is where the problems start. We briefly revisit the statistical issues that are relevant in the analysis of bullet lead compositional data and consider the steps that might be needed in order to develop a probability-based approach to estimating the significance of a match between two samples.
We compare the approaches for analyzing lead compositional data that have been proposed by the FBI and by researchers at Iowa State University, and briefly consider the recommendations provided by the National Research Council committee that recently published a report on the topic.

Searching a Database of DNA Profiles: Theory and Practice
This paper evaluates forensic identification hypotheses conditionally on the characteristics observed both on a crime sample and on individuals profiled in a database. First we solve the problem via a computationally efficient Bayesian Network obtained by transforming some recognized conditional specific independencies into conditional independencies. Then we propose an Object Oriented Bayesian Network representation, first considering a generic characteristic, then inheritable DNA traits. We show how to use the Object Oriented Bayesian Network approach to evaluate hypotheses concerning the possibility that some unobserved individuals, genetically related to the individuals profiled in the database, are the donors of the crime sample. We also propose three applications. The first is a simulation study that helps evaluate the identification ability of the proposed method. Second, we estimate the elapsed time required to obtain results in a search for a single locus, as a function of the database size and the number of alleles, on two different machines. Third, we report the results of screening a real database, showing how extending the search to individuals related to the database members has given interesting investigative results. We finally illustrate the original software produced.

Forensic Stylistics: Scientific Reliability
This study provides an overview of forensic stylistics and discusses the admissibility of forensic stylistics as scientific evidence in terms of its general scientific reliability.
Cases that have dealt with the admissibility of linguistic analysis are summarized, and the discussion proceeds by dividing the methodologies according to whether they are qualitative or quantitative. Potential drawbacks in qualitative methods are highlighted and recent developments in quantitative methods are summarized. In particular, this study notes a significant transition from qualitative to quantitative methodologies in authorship identification. Within quantitative methods there is a trend to use Principal Components Analysis as a primary statistical technique and to focus on syntactic patterns rather than function words. It is suggested that quantitative methodologies could produce reliable scientific evidence as long as other variables, such as genre, the time gap between texts, or the size of a corpus, are controlled so that their effect on the reliability of the results is minimized.

Natural Experiments on Latent Print (Fingerprint) Accuracy
A number of recent U.S. court decisions and articles have treated casework as natural experiments measuring the accuracy of latent print (fingerprint) identification. This paper discusses flaws in those treatments. The paper then proposes an alternative way of treating casework as a natural experiment measuring the accuracy of latent print identification. The paper discusses a study of known misattributions to try to estimate the potential rate of false attribution in latent print casework. It concludes that there is reason for significant concern about false attribution in latent print identification.

An MCMC Method for Resolving Two-person Mixtures
This paper describes a Markov chain Monte Carlo (MCMC) method for resolving DNA mixtures containing at most four peaks per locus into a major and a minor contributor. Unlike previous methods, this method can provide posterior probability assessments of the most probable genotype and a likely range for the mixing proportion.
Testing the Effects of Selected Jury Trial Innovations on Juror Comprehension of Contested mtDNA Evidence
In most cases, American jurors receive high marks for their abilities to comprehend evidence and law. However, many critics of the jury voice concerns about the jury’s ability to understand and properly employ complex scientific evidence, especially when statistical presentations are involved. For example, research reveals that jurors often struggle in comprehending and correctly weighing contested forensic DNA evidence. At the same time that complex scientific evidence and statistics are being used more and more in court, a jury trial reform movement has emerged. The objective of this movement has been to increase and optimize juror understanding of complex evidence by introducing educational tools and decision aids to the traditional trial format. Although some jury reforms such as notetaking have enjoyed widespread adoption, adoption of other innovations has been hampered by heavy reliance on tradition and a relative lack of empirical research demonstrating the effects of these jury reforms on juror comprehension of complex evidence. The National Institute of Justice (NIJ) has committed to improving the use of DNA in criminal cases. This presentation will discuss a research project funded by NIJ to examine empirically the effects of jury trial reforms on juror comprehension of contested mitochondrial DNA (mtDNA) evidence. The first half of the presentation (by BMD) will discuss the research objectives, design and overall findings. More specifically, this discussion will address which jury trial innovations were utilized and their impact on juror comprehension of complex scientific evidence. The second half of this presentation (by EJF) will take a closer look at how jurors involved in the mock jury project comprehend, utilize, and weigh the statistical and probabilistic evidence offered by both the defense and prosecution expert witnesses.
This presentation will examine jurors on the individual level and also as a group during the deliberation process. Questionnaire results and transcriptions of juror deliberations will be used to highlight the following issues: accuracy in jurors' mtDNA definitions and comprehension; impact of education on juror comprehension; occurrence of "defense" or "prosecutor" fallacies by jurors; effects of deliberation on mtDNA knowledge; and the relationship between mtDNA comprehension and verdict preferences.

Point Process Models on Minutiae Features for Assessing Fingerprint Individuality
When a query fingerprint is declared as a match, or non-match, with the template of a claimed identity, measures of confidence associated with the observed degree of match are currently either unavailable or unsatisfactory. The question we address relates to the individuality of fingerprints: How similar should two fingerprints be before we can conclude that they are from the same finger? To answer this question satisfactorily, appropriate statistical models need to be developed on the space of fingerprint features, models able to capture all possible variations of the features across fingerprint images. In this paper, we focus on minutiae points as our fingerprint feature and develop a family of non-homogeneous Poisson point process models to represent their spatial distribution in fingerprint images. The elicited models are flexible enough to capture all aspects of spatial variability of the minutiae points. We propose p-values as the statistical measures of confidence for assessing fingerprint individuality. A procedure for calculating p-values corresponding to the observed number of matching minutiae points between the query and template fingerprint images is described.
For the "12-point match" criterion, experiments using the fingerprint databases NIST4 and FVC2002 yield p-values ranging from 10^-19 to 10^-13, and 10^-16 to 10^-11, respectively, when the number of query and template minutiae points ranges from 20 to 40.

Building Blocks for DNA Identification from Bayesian Networks
Problems of forensic identification from DNA evidence can become extremely challenging, both logically and computationally, in the presence of such complicating features as missing data on individuals, mixed trace evidence, mutation, silent alleles, laboratory and handling errors, etc. In recent years it has been shown how Bayesian networks can be used to represent and solve such problems. The latest "object-oriented" version (version 6.4) of the Bayesian Network software system Hugin allows a network to contain repeated instances of other networks. This architecture proves particularly natural and useful for genetic problems, where there is repetition of such basic structures as Mendelian inheritance or mutation processes. I will describe a "construction set" of fundamental networks that can be pieced together, as required, to represent and solve a wide variety of problems arising in forensic genetics. Some examples of their use will be provided.

Discrimination as Causation: Implication for the Presentation of Statistical Evidence in Legal Settings
The legal descriptions of discriminatory behavior employ the language of counterfactuals, language akin to that used in the statistical literature on causation. This presentation explores some of the implications of measuring causation for discrimination. In particular, I argue, in part based on material found in the recent report of the NAS-NRC Panel on Measuring Discrimination, that this causal language poses a greater demand on statistical evidence of discrimination in legal settings than many of those who have worked in the area have recognized.
Compositional Analysis of Bullet Lead as Forensic Evidence
This paper describes the 2004 NRC Report, Forensic Analysis: Weighing Bullet Lead Evidence, and the issues it addresses.

Methods for Measuring the Fairness of the Allocation of Shares in Initial Public Offerings
During the late 1990s, substantial profits were made from shares in internet and information technology companies that went public, as their initial offering price was less than the price that their shares brought on the first day they were publicly traded. Some investment companies made special arrangements, which were improper, with “favored” customers who promised them extra business. Several companies signed consent decrees with the authorities and paid fines for sharing profits with some clients. The authors were engaged by a law firm to examine whether a set of customers, selected by NASD after examining the trading histories of their client firm’s customers for two years, had been favored in IPO allocations. This paper describes several measures used to assess the fairness of these allocations. The first type compared the success rates of various groups of customers to those in the accused group. Since one would expect customers who had given a firm more business in the past to receive more shares than smaller customers, we used prior business to determine the fraction of shares that the customers who requested an IPO should receive. An analog to the “excess relative risk” measure used in epidemiology, based on the difference between the observed and expected number of shares for all the IPO deals during the period, was developed to assess fairness in allocation. It turned out that the so-called “profit sharers” received less than expected. Various sensitivity analyses also will be described.
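The observed-versus-expected comparison described in the IPO abstract can be sketched as follows. The abstract does not give the exact formula, so this is an assumption: expected shares in each deal are taken to be proportional to each requesting customer's prior business, and the measure is (observed - expected) / expected, summed over all deals, by analogy with excess relative risk. All names and the data layout are hypothetical.

```python
def expected_shares(total_shares, prior_business, requesters):
    """Split a deal's shares among requesters in proportion to prior business."""
    total_business = sum(prior_business[c] for c in requesters)
    return {
        c: total_shares * prior_business[c] / total_business
        for c in requesters
    }


def excess_relative_allocation(deals, prior_business, group):
    """(observed - expected) / expected shares for `group` across all deals.

    Each deal is a dict mapping every customer who requested shares to the
    number of shares actually received (0 if none). A negative result means
    the group received fewer shares than its prior business would predict.
    """
    observed = 0.0
    expected = 0.0
    for allocated in deals:
        fair = expected_shares(
            sum(allocated.values()), prior_business, allocated.keys()
        )
        observed += sum(n for c, n in allocated.items() if c in group)
        expected += sum(e for c, e in fair.items() if c in group)
    if expected == 0:
        raise ValueError("group has no expected shares in these deals")
    return (observed - expected) / expected
```

For instance, with hypothetical prior business of 60, 30 and 10 units for customers A, B and C, and two 1000-share deals in which A received 500 and 700 shares, the measure for A is negative, since A's business share would predict about 1267 shares in total against 1200 observed.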
Methods for Assessing Whether Data Underlying Statistical Analyses Submitted as Evidence in Courts Satisfy the Assumptions Required for the Results to be Reliable
Many commonly used statistical methods were developed for data that come from a normal or bell-shaped distribution and are collected as a simple random sample from the relevant population. The database available in legal cases often was collected for administrative purposes rather than for statistical analysis, and the statistician using the data needs to check that the data are consistent with the assumptions underlying the methods used to draw inferences relevant to the legal action. In Branion v. Gramley, 855 F.2d 1256 (7th Cir. 1988), for example, a sample of driving times was assumed to have a normal distribution but this assumption was not checked. In a recent securities law case, one party’s expert originally used the Pearson correlation coefficient to measure the association between two variables. These data will be shown to violate the assumption of normality, and a noticeably different inference is obtained when the Spearman rank correlation, which is valid for all distributions, is used. When data are collected over time, observations nearby one another are related or dependent rather than independent. This problem arose in a jury discrimination case, Moultrie v. Martin, 690 F.2d 1078, 1082 (4th Cir. 1982), as well as in the securities law case for which we were consultants to a law firm. Finally, a method for checking whether consecutive observations are independent will also be described, and the effect of dependence on the commonly used Wilcoxon two-sample test and Shapiro-Wilk test will be illustrated.

A Probability Argument for the Preponderance (or not) of Evidence Regarding Rare Events: The Source of an Adverse Event
The P > ½ rule for the preponderance of evidence is considered in a useful context.
Consider a group of objects where, for m of them, the status “yes”, “yes with a given probability” or “no” is known for each. For the remainder, n of them, the status is completely unknown. However, certain statistical information about the status is known for the group of n objects. In a particular application, of the group of m objects of known status, one object is a “yes with a high probability” and each of the other m-1 objects is a “no”. Exposure to a “yes” causes an adverse event. Someone is exposed to all m+n objects and experiences an adverse event. Attorneys wish to know the probability that the adverse event is attributable to the object with the probable “yes”. This paper considers this and related questions. The litigation concerned an acquisition of pediatric HIV and the question of whether this acquisition is attributable to a specific plasma donor, where, over time, the patient received plasma from m+n = 431 plasma donations. Advantages and disadvantages of conservative probability arguments used in litigation are discussed.

Evaluation of Extreme-value Evidence
The evaluation of evidence often requires consideration of probabilities in very sparse areas of a multivariate sample space. It is important that correct models are used and that the values of the associated probabilities or probability densities are estimated accurately. In particular, if both the values in the numerator and in the denominator are extremely small it is important to know the sensitivity of the likelihood ratio to the precision of the estimates of these values. The results of an investigation of the behaviour of the likelihood ratio in sparse areas of the sample space will be reported.

Interval Estimates of Ages of Insect Larvae Based on Multivariate Size Data with Adjustment for Covariates
A carrion-fly larva collected from a corpse may be used to estimate a minimum time since death.
It has multivariate size measurements y (for example, body length, dry weight, and length of the dorsal cornu), and it grew under conditions x (e.g., degree-hours). Its age d is unknown. Training data are available for y from multiple specimens of the same species over a grid of values of conditions x and ages d. Using interpolation models for the mean vector µ(x,d) and variance-covariance matrix S(x,d) of y as functions of age d and the covariates x, we describe and illustrate a method for constructing an interval estimate of the age d of the mystery specimen. The method is based on a heteroscedastic multivariate mixed model for y, which extends our earlier work in the Journal of Forensic Sciences, 1995, 40:585–590.

The Office of Federal Contract Compliance Programs: New Statistical Initiatives
The Office of Federal Contract Compliance Programs (OFCCP) is responsible for enforcing provisions that prohibit discrimination by federal contractors and subcontractors with regard to race, sex, ethnicity, national origin, religion, disability, or status as a Vietnam era or special disabled veteran. To enforce these provisions, the agency selects a sample of federal contractor establishments each year and requests information on the contractors' hiring, compensation, promotion and termination patterns. In 2003, the OFCCP created the Division of Statistical Analyses. This division’s goal was to broaden OFCCP’s review of discrimination issues and to bring greater accuracy to the compliance review process through current statistical data analysis techniques and investigative procedures. This paper will present a summary of OFCCP’s prior focus areas and investigative techniques, the challenges OFCCP faces in expanding the scope and accuracy of the review process, and its plans for achieving its objectives.
OFCCP’s principal challenge is to account for differences in the duties, skills and responsibility level of jobs and the employee’s or applicant’s qualifications for the job in assessing discrimination issues. Given the challenge surrounding these factors and with limited resources, OFCCP traditionally limited the scope of the compliance review to ascertaining whether the company acted fairly in the hiring of unskilled labor, using simple data collection and categorical data analysis techniques. We will review OFCCP plans for assessing similarities between reported positions, coupled with linear and non-linear regression analysis techniques, to reach the agency’s new objectives. We will also discuss comparative strategies and the issues OFCCP has faced with these approaches to guide the current direction.

Structural Taxonomies of Legal Evidence
All entities are classifiable into generic types. The classification forms a taxonomy, or ontology in artificial intelligence terminology. The structural elements used in the taxonomy are employed to generate knowledge about the entities in the taxonomy. The same should be true of legal evidence. Legal evidence can be classified into groups, and the underlying structuring elements used to form the taxonomy can be used to tell us something about the nature of that evidence. However, entities which are evidence types present more difficulties than other entity types, as some of the structuring elements rely upon the propositional network to which the evidence type pertains. This paper will try to unravel some of the contradictions and problems which this presents.

Variability in the Refractive Index of Glass
In criminal investigations involving glass evidence, refractive index (RI) is the property of glass most commonly used by forensic examiners to determine the association between control samples of glass obtained at the crime scene and samples of glass found on a suspect.
Previous studies have shown that an intrinsic variability of RI exists within a pane of float glass. In this paper, we attempt to determine whether this variability is spatially determined or random in nature, the conclusion of which plays an important role in the statistical interpretation of glass evidence. We take a Bayesian approach in fitting a spatial model to our data, and utilize the WinBUGS software to perform Gibbs sampling. To test for spatial variability, we propose two test quantities, and employ Bayesian Monte Carlo significance tests to test our data, as well as other specifically formulated data-sets.

Multivariate Statistical Approaches for the Discrimination of Textile Fibers by UV/visible and Fluorescence Microspectrophotometry
Trace evidence has taken on a role of increasing importance in forensic investigations. The principle that "every contact leaves a trace" establishes the potential value of minute traces of evidence found at the crime scene, or found on a victim or suspect. Fiber evidence is class evidence (i.e., not unique), because many fibers from different sources could be indistinguishable. The discovery of a fiber and its identification as a particular fiber type (e.g., acrylic, cotton, nylon, polyester) may not, of itself, provide much support for a forensic investigation. The probative value of particular fibers found at a crime scene depends on their uniqueness relative to the background of fibers normally encountered at that location in the absence of the crime. What is often required is information that makes the trace evidence more specific and discriminating. Multivariate statistics offers an interpretative methodology that can be readily applied to data generated by modern analytical techniques such as microspectrophotometry. We have previously applied multivariate statistics to forensic problems such as discrimination of automotive paint samples, copy toners, and fluorescent brighteners.
We have now developed a database of over 500 dyed textile fibers collected from commercial sources, with information characterizing these fibers. Over 25,000 spectra, consisting of UV-visible absorbance spectra and fluorescence spectra taken at four excitation wavelengths (365, 405, 436, and 546 nm), were also acquired. Principal component analysis and linear discriminant analysis are of great utility in exploring relationships among groups of spectra, in visualizing differences between groups of spectra, in assessing quantitatively the discriminating power of different spectroscopic techniques, and in confirming the statistical validity of discrimination observed in UV-visible and fluorescence microspectrophotometry. The fibers and associated spectra in the database represent an extensible tool for fiber comparisons in casework, and should also be of value in quality control and training of analysts.

Identification and Separation of DNA Mixtures Using Peak Area Information
We show how probabilistic expert systems can be used to analyse forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area is modelled with conditional Gaussian distributions. The expert system can be used not only for ascertaining whether individuals whose profiles have been measured have contributed to the mixture, but also to infer DNA profiles of unknown contributors by separating the mixture into its individual components. The potential of our methodology is illustrated on case data examples and compared with alternative approaches. The advantages are that identification and separation issues can be handled in a unified way within a single network model, and the uncertainty associated with the analysis is quantified.
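To make the role of Gaussian peak-area modelling concrete, here is a deliberately simplified sketch of separating a two-person mixture at a single locus. It is not the authors' expert system: it assumes that each allele's relative peak area is normally distributed around the share of DNA carrying that allele, with a fixed standard deviation, and it simply searches a grid for the genotype pair and mixing fraction with the highest likelihood. The function names, the value of sigma and the grid are all hypothetical.

```python
from itertools import combinations_with_replacement
from math import log, pi


def expected_areas(g1, g2, theta, alleles):
    """Expected relative peak area per allele.

    Contributor 1 supplies a fraction `theta` of the DNA, contributor 2 the
    rest; each copy of an allele carries half of its contributor's share.
    """
    return [
        theta * g1.count(a) / 2 + (1 - theta) * g2.count(a) / 2
        for a in alleles
    ]


def gaussian_loglik(observed, expected, sigma):
    """Log-likelihood of independent Gaussian errors with common sigma."""
    return sum(
        -0.5 * log(2 * pi * sigma ** 2) - (o - e) ** 2 / (2 * sigma ** 2)
        for o, e in zip(observed, expected)
    )


def best_separation(areas, sigma=0.05, theta_grid=None):
    """Return (loglik, major genotype, minor genotype, theta) maximising
    the likelihood over all genotype pairs and a grid of mixing fractions."""
    alleles = sorted(areas)
    total = sum(areas.values())
    observed = [areas[a] / total for a in alleles]
    genotypes = list(combinations_with_replacement(alleles, 2))
    if theta_grid is None:
        # 0.50, 0.55, ..., 0.95: contributor 1 is treated as the major one.
        theta_grid = [i / 20 for i in range(10, 20)]
    best = None
    for g1 in genotypes:
        for g2 in genotypes:
            for theta in theta_grid:
                ll = gaussian_loglik(
                    observed, expected_areas(g1, g2, theta, alleles), sigma
                )
                if best is None or ll > best[0]:
                    best = (ll, g1, g2, theta)
    return best
```

A full treatment would combine loci, place priors on genotypes from allele frequencies and report posterior probabilities rather than a single best fit; the sketch only shows how peak areas can discriminate between candidate separations.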
Sample-size Determination for Analysis of Ecstasy Pills Experience shows that a seizure of pills suspected to contain drugs will very likely either consist entirely of drug pills of the same kind or consist of pills with no drug content at all. If this experience can be quantified, it is possible to reduce the number of pills that must be selected for analysis. Recent results (Aitken, 1999; Coulson et al., 2001) show that a Bayesian approach to sample size determination expresses the problem in a more natural way from a forensic point of view, provided an informative prior can be defined. Also, the sample sizes can be further reduced in this framework compared with the more classical hypergeometric approach. These results have been adopted by the European Network of Forensic Science Institutes (ENFSI) in the Guideline on Representative Drug Sampling, published by the ENFSI Drug Working Group. In this text, as well as in other published results, it is suggested to use a beta prior, which should be highly left-skewed when assumptions of the above kind can be made. In this paper we show how a beta prior can be calculated from a database of analyzed Ecstasy pills. By dividing the database items into different sub-populations, it is possible to estimate the parameter in the prior that controls the left-skewness of the distribution. Statistical Evidential Assessment of Partial Fingermarks Recent challenges to fingerprint evidence, combined with recent cases of wrong identification, have strengthened the need for statistical research to underpin the evidential assessment of fingerprints. Previous studies, carried out on models based on minutiae configurations, have several limitations. 
They did not capture spatial relationships between minutiae; at the core of each model, independence between characteristics measured from minutiae was assumed without sufficient testing; they focused on match probabilities with a weak account of tolerances due to distortion or clarity of marks. Most of the proposed models were also not subjected to extensive empirical validation. The present research addresses some of these limitations through the development of a model that also captures spatial relationships between minutiae. This model incorporates variability inherent in marks left by the same finger due to distortion and clarity. The strength of evidence is expressed using a likelihood ratio that weighs within-finger distortion against between-finger variability. The model does not rely on independence assumptions. The aim of this model is not to demonstrate the individuality of a complete and well-reproduced fingerprint, but to assess the evidential contribution of marks that can be partial and distorted, with a poor signal-to-noise ratio. Demonstration of Minority Disadvantage when Minority Populations Are Small The city of Chicago has a set-aside program to aid construction businesses owned by disadvantaged minorities. Determination of whether or not a minority population is in fact disadvantaged is based upon economic information from the US census 5% sample. Census data indicates economic disadvantage to the Asian community in Chicago. However, the size of the Asian population, and in particular the size of the sample of self-employed Asians in the construction industry, is small. Statistical analysis of this data does not rise to the level of a statistically significant difference between Asians in the construction industry and white males in the construction industry. The city of Chicago limited its analysis to the city’s population under the belief that only that population was relevant to the identification of disadvantaged populations. 
Populations in similar situations in other cities with the same relevant characteristics were deemed irrelevant. In contrast, statistical analysis of the census data from other areas of the country, in which there are large Asian populations, was used to demonstrate a statistically significant pattern of disadvantage for Asians in the construction industry as compared to white males. This comparison, along with the demonstration that the pattern in Chicago is similar to the patterns in other areas, will be used to make a case for inclusion of Asian construction businesses in the city of Chicago minority set-aside program. A Statistical Procedure and Monte Carlo Simulation to Analyze a Creel/Angler Survey: Test of Validity and Estimation of Accuracy Multiple interview longitudinal creel/angler surveys are used to estimate the number of anglers and to estimate annual fish consumption, the information necessary to conduct an EPA risk assessment. Typically it is not possible to directly select a random sample from the population of anglers; instead, a random sample of days is chosen for conducting interviews of anglers. Standard statistical methods cannot be applied directly. A statistical procedure was designed to characterize angler activities based on the survey data and to calculate exposure factors necessary for illustrating the fish consumption pathway of recreational anglers in a human health risk assessment for the river. The procedure is based upon basic laws of probability and produces estimates of fish consumption suitable for EPA-type risk assessments. To test the efficacy of the statistical procedure, a Monte Carlo simulation was developed. This allowed researchers to generate simulated angler populations of varying sizes and fishing characteristics and to test the creel/angler estimation procedure against the known populations produced by the simulation. 
The simulation results demonstrate the validity of the proposed estimation procedure and can be used to provide confidence bound estimates of fish consumption. Dealing with Selection Effects in Forensic Science Many areas of forensic science deal with questions concerning the selection of evidence. For example, in fiber casework a foreign fiber found on a murder victim may be compared to a few items related to the suspect, such as his clothing, or to a large number of items, e.g., when his whole house, office and car are searched. How does the evidential value of a matching fiber depend on the way the fiber was selected? A second example is facial comparison, where a suspect is compared to a robber on a video tape. How should the facial comparison expert take account of the fact that the suspect was identified by the general public after the tape was shown on television? A third example is statistical evidence used in cases where someone is involved in a strikingly high number of incidents. This type of evidence was recently used in a criminal case against a nurse in the Netherlands. The statisticians involved in this case disagreed on the "post hoc correction" for the fact that the nurse was picked out precisely because she was involved in so many incidents. Current literature warns against the "selection effect", but is rather vague about how to deal with it. I will argue that in the classical (frequentist) approach, corrections for the selection effect are problematic. In the Bayesian framework, selection is dealt with in an intuitively appealing way through the incorporation of the prior odds. I will furthermore make some suggestions for reporting in this type of situation. Damage Assessment in Breach of Contract Litigation A recent suit in U.S. District Court alleged breach of contract by a third-party administrator of workers’ compensation claims. 
Applying a theory of lost economic opportunity, the plaintiff’s attorney retained experts in claims adjustment, insurance, and statistics to undertake an assessment of damages. A stratified, systematic random sample was drawn from the population of claims and subjected to a review of processing and costs. Several candidate ratio estimators were considered in estimating total damages. This paper describes in detail the sampling design and estimation methods used by statistical experts in the case and discusses how these procedures and results were received by the court. More aggressive application of the “loss of economic opportunity” or “loss of chance” doctrine will expand the scope of actions for which statistical assessments of damage may be indicated. Decision Analysis in Forensic Science Forensic scientists are routinely faced with the problems of making decisions under circumstances of uncertainty (i.e. to perform or not perform a test). A decisionmaking model in forensic science is proposed, illustrated with an example from the field of forensic genetics (paternity case). The approach incorporates available evidence and associated uncertainties with the assessment of utilities (or desirability of the consequences). The paper examines a general example for which identification will be made of the decisionmaker, the possible actions, the uncertain states of nature, the possible source of evidence, and the kind of utility assessments required. It is argued that a formal approach can help to clarify the decision process and give a coherent means of combining elements to reach a decision. More precisely, two questions are approached: (1) can we obtain a value supporting one of the proposed propositions in this scenario?, and (2) how can the laboratory or the customer take a rational decision on the necessity to perform blood tests after an estimate of possible values of likelihood ratio? Assessing the Legal Relevance of Bullet Lead Evidence: Did the NRC Misfire? 
According to a 2004 National Research Council Report, compositional analysis of bullet lead (CABL) has the potential to be a reasonably accurate way to determine whether two bullets could have come from the same "compositionally indistinguishable volume of lead" and "may thus in appropriate cases provide ... evidence that ties a suspect to a crime or in some cases evidence that tends to exonerate a suspect." However, the NRC report did not specify (at least not clearly) the circumstances under which it would be reasonable to infer that compositionally indistinguishable bullets came from the same source, such as the same box of ammunition. I will argue that the NRC report presents an inadequate and misleading analysis of the legal relevance of CABL evidence by failing to draw an adequate distinction between two evidentiary elements that David Schum and his colleagues have called reliability and diagnosticity, that the report's conclusions appear to rest on the fallacious assumption that one can ascribe meaning to a CABL match based solely on its reliability in the absence of information about its diagnosticity, and consequently that the report's analysis of the admissibility of CABL evidence under the Daubert standard missed the mark. Problematic DNA Evidence: Case Studies This paper will explore problems that can arise when assessing the probative value of contemporary STR test results and characterizing that value for a jury. Examples from several actual cases will be used to illustrate the ambiguity created by potential failure to detect all alleles, mixed samples, possible artifacts, multiple traces and multiple suspects. The influence of observer effects on test interpretation will be illustrated and discussed. The way forensic analysts characterized (or mischaracterized) this DNA evidence in reports and trial testimony will be examined and critiqued. 
The Evidentiary Value of Forensic Mitochondrial DNA Evidence There have been several recent successful attempts to introduce mitochondrial DNA (mtDNA) evidence into criminal trials as evidence that the suspect is included as a possible contributor of certain crime scene evidence. As with nuclear DNA, the courts require that the introduction of mtDNA evidence be accompanied by an associated statistic to put the relevance of the evidence into context. Although mtDNA, unlike nuclear DNA, is not considered a unique identifier, the frequency statistics presented can nonetheless be powerful evidence suggestive of the suspect’s involvement. These statistics are determined by reference to an mtDNA database developed and maintained by the Federal Bureau of Investigation (FBI) and the Scientific Working Group on DNA Analysis Methods (SWGDAM). As of November 2004, the FBI/SWGDAM database has 5071 mtDNA sequences and is subdivided into fourteen “racial” categories ranging in size from 8 mtDNA sequences (“Pakistan”) to 1814 mtDNA sequences (“Caucasian”). The developers of the database have determined that genetic variation within groups is of minimal relevance and ignore the geographic origins and genetic variation of the mtDNA sequences placed as representative in the database. To generate frequency statistics, forensic scientists use the counting method: counting the number of observations of the suspect’s mtDNA sequence in the FBI/SWGDAM database, calculating a frequency estimate, placing a 95% confidence interval around the frequency estimate, and reporting the upper and lower values. Scientists, defense lawyers, and other observers have started to level criticisms at the FBI/SWGDAM mtDNA database that, if legitimate, render the statistics developed from it statistically invalid. The most complete enunciation of these criticisms occurred in an mtDNA admissibility hearing in United States v. Ida Chase in Washington, D.C. 
There, molecular anthropologists and other scientists concluded that the database, and its sub-categories divided on racial lines, are neither sufficiently large nor sufficiently representative to provide meaningful information. Specifically, studies of the “African-American,” “Apache,” and “Navajo” sub-databases conducted in connection with the hearing concluded that the FBI/SWGDAM mtDNA database does not adequately represent the diversity of mtDNA haplogroups in those populations, because of a failure to understand and to account for historic and more recent migration patterns, geographic clustering of haplotypes, and consequent regional differences. The presenters will report the differing views in the Ida Chase litigation and suggest that regional databases might be an effective, reasonable, and feasible alternative to the current FBI/SWGDAM national database for the generation of meaningful statistical information. Analysis of Drop-out Alleles Using Object-oriented Bayesian Networks We consider the possibility of a drop-out allele as a complicating feature to be accounted for in forensic analysis. A drop-out allele is an allele which is not recorded by the equipment. This can be due to a mutation in the primer binding region resulting in a DNA amplification failure, or because the measuring apparatus is unable to record certain allele values. In these cases we call it a silent allele. A drop-out allele can also be caused by an error of the equipment; in this case we call it a missed allele. In particular, when an allele drop-out occurs, only one allele is amplified and the individual may appear to be homozygous. Silent alleles can be passed, by Mendelian inheritance, to the child. We can thus have false evidence of exclusion, leading one to conclude, for instance in paternity testing cases, that an alleged father is not the true father. 
We show how object-oriented Bayesian networks can be used to model paternity testing cases when accounting for silent alleles and missed alleles as well as for mutation (or any combination of these). We show, through many examples, that taking into account allelic drop-out has a remarkable effect in both paternity testing cases and criminal identification cases. Moreover, we show that in paternity cases where, in addition to the DNA profiles of the child, mother and putative father, that of a putative father's brother is given, accounting for silent and missed alleles can have (for certain patterns) a very strong effect on the likelihood ratio in favour of paternity; whereas if the drop-out allele is not taken into account, the additional information on the putative father's genotype is irrelevant. Griggs, Hazelwood, and Legal Perceptions of Fairness: The Role of Statistics on the Fairness Debate in American Discrimination Law Griggs v. Duke Power and Hazelwood School District v. United States are two seminal employment discrimination cases that recognized and foreshadowed the central role of statistical analysis in American discrimination law. In different ways, both decisions also signaled the beginning of significant, long-term debates about the meaning of fairness and equal treatment. Griggs began continuing discussions about issues such as the appropriateness of non-intent based theories in discrimination law and the balance between discriminatory effects and business purpose. Hazelwood set the stage for affirmative action and, hence, for one of the central battlegrounds for debates about fairness in American discrimination law. Evaluation of Glass Fragments Described by Elemental Content – Applications in Casework The main aim of the analysis of evidence in the form of glass fragments, both those transferred to a suspect's clothes and those collected at the scene of the crime, is comparison. 
The comparison aids the investigator/scientist in assessing the relative likelihoods of the evidence under each of two propositions. The likelihood ratio, which compares the probability of the measurements on the evidence assuming a common source for the crime scene and suspect evidence with the probability of the measurements on the evidence assuming different sources for the crime scene and suspect evidence, is a well-documented measure of the value of the evidence. Much work has been done on this kind of evaluation of glass traces when one variable has been considered, e.g., the refractive index. Recently, a model considering more than one physical property of the evidence material was proposed. [C.G.G. Aitken, D. Lucy, Evaluation of trace evidence in the form of multivariate data, Appl. Statist. (2004), 53, Part 1, pp. 109-122; corrigendum 665-666.] Within this model a multivariate kernel density approach was adopted for modelling between-object distributions and a multivariate normal distribution was adopted for modelling within-object distributions. This model was successfully tested on the set of results obtained during analysis of 200 glass objects by means of the SEM-EDX method. Each glass object was described by the concentrations of eight elements. Moreover, a graphical method of estimating the dependence structure was employed to reduce the highly multivariate problem to several lower-dimensional problems. Two experiments were performed: the first one (200 simulations) considered the proposition that the fragments have the same source and the second one (19,900 simulations) considered the proposition that the fragments have different sources. False positive answers were obtained in 5% of cases and false negative answers in 7%. Some practical applications of this approach will be presented as this approach has recently been introduced for casework in the IFR. 
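The likelihood ratio described above can be illustrated with a deliberately simplified, one-variable sketch. It is not the multivariate kernel-density model of the abstract: it assumes a single measurement per fragment, normally distributed within-source and between-source variation, and it uses only the difference between the two measurements. The function names and the standard deviations are hypothetical. The sketch also checks that 19,900 is the number of distinct pairs that can be formed from 200 objects, which is consistent with the size of the second experiment.

```python
from math import comb, exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def likelihood_ratio(difference, within_sd, between_sd):
    """Likelihood ratio for the difference between two measurements.

    Same source:      difference ~ N(0, 2 * within_sd^2)
    Different source: difference ~ N(0, 2 * within_sd^2 + 2 * between_sd^2)
    (assumption: source means are themselves normally distributed).
    """
    same_sd = sqrt(2) * within_sd
    different_sd = sqrt(2 * within_sd ** 2 + 2 * between_sd ** 2)
    numerator = normal_pdf(difference, 0.0, same_sd)
    denominator = normal_pdf(difference, 0.0, different_sd)
    return numerator / denominator


# Hypothetical standard deviations, in arbitrary measurement units.
lr_close = likelihood_ratio(difference=0.0, within_sd=1.0, between_sd=10.0)
lr_far = likelihood_ratio(difference=8.0, within_sd=1.0, between_sd=10.0)

# Number of distinct pairs among 200 objects.
different_source_pairs = comb(200, 2)
```

A value above one supports the common-source proposition and a value below one supports the different-source proposition; here `lr_close` falls in the first group and `lr_far` in the second.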
Poster Papers Complex DNA Mixture Analysis with Object-oriented Bayesian Networks In the last decade there has been increasing interest in forensic identification problems, in great part due to advances in forensic biology. There has thus been an increase in the complexity of the casework problems that need to be solved. Probabilistic expert systems are an extremely useful and versatile tool for solving complex cases of DNA mixtures. In this work we approach the problem of a DNA mixture in which different genetically related individuals might be involved. The criminal case we analyze involves not only inheritance in the DNA profiles of the suspect and the victims, but also mixtures from more than one trace. Our purpose is to perform the case analysis by applying object-oriented Bayesian networks (a resource available in the Hugin 6.4 software). The modular, flexible structure of the object-oriented networks makes it possible to analyse very complex cases. In the case studied here, two victims and one suspect belong to the same family, and DNA-mixture profiles from three different traces are also presented as evidence. In this kind of case we are confronted with additional computational complexities, and it is extremely important to find the best graphical representation of the problem. Litigation Support - And Why It Makes Everything Else Look Easy This presentation will include many and varied confrontations and controversies experienced over a lengthy career as a statistical consultant involved in litigation support work. Case studies and topics will cover a broad range of legal issues and disputes, including advertising claim substantiation, environmental conflicts, product performance (or lack thereof), Tax Court asset valuation, insurance claim settlements, and more. Some hopefully interesting - but somewhat perplexing - anecdotes illustrating the use, misuse and abuse (both real and perceived) of statistical methods will be the primary focus of the talk. 
Examples will include various applications of sampling techniques, experimental design, applied life data analysis, and basic probability concepts, and how they were viewed by academicians and attorneys representing both client and adversarial interests. The Use of a Combined Search, Statistic and Information Management System to Aid in the Rapid Identification of Human Remains The Armed Forces Institute of Pathology is composed of several different areas of pathology, one of which is the Armed Forces Medical Examiner System (AFMES). To assist the AFMES in recent death investigations, the Armed Forces DNA Identification Laboratory (AFDIL) was established to conduct identification as well as the re-association of human remains using DNA methodologies. Depending upon the circumstances of the incident, the human remains may be fragmented, charred, or decomposed, which can make traditional identification processes such as dental or fingerprint analysis impossible. Therefore, AFDIL utilizes the widely accepted technique of fluorescent Short Tandem Repeat (STR) analysis for forensic DNA analysis on human remains. The ability to generate, compile and sort DNA evidence has become essential in the DNA identification process. As a result of our participation in the DNA identification process for both military fatalities and civilian mass disasters, our laboratory developed search and statistical programs to manage the hundreds of DNA profiles generated. These programs are presented in an easy-to-use, intuitive graphical user interface (GUI) and written using Borland Delphi version 7 and the Microsoft SQL Server 2000 database. The Mass Fatality Information Management System (MFIMS) easily stores and tracks all information associated with references for unidentified remains. In the case of a mass disaster incident, suitable family references are collected for the lost individuals. This sample volume can easily increase to the hundreds. 
The MFIMS system allows personal data of both the victim and the donor(s) of the family reference(s) to be maintained in a format where it is easily retrievable. The system also indicates whether a suitable reference has been obtained for a victim, as well as which samples have been identified and associated with that victim. The Automated Statistics and Analysis Program (ASAP) is a combination search and statistical analysis program. Data generated via expert systems or genotyping analysis applications is easily imported into the searchable database. Within the ASAP module are pre-consensus and consensus functions that compare profiles analyzed independently and flag differences between the data sets for the purpose of review. Searches can be conducted on the basis of the type of reference information available. The search algorithms consist of: direct reference, one reference, either parent or a child, two children to find a parent, spouse and child to find a parent, both parents to find a child, and sibling to sibling. Allele frequencies are available for Caucasian, Hispanic, Asian and African American populations. The probability can be calculated from only one chosen population or from all four populations. Additional populations can also be added into the system. Results are expressed in the form of either a random match probability or likelihood ratios such as parentage indexes. All of the resultant information is printed in one easily comprehensible report, which includes the data, statistical analysis, case numbers and victim/reference names. The system is highly versatile, so that it can be used on a small incident, such as a helicopter crash in support of Operation Iraqi Freedom, or on a much larger incident, such as the Sept. 11th attack on the Pentagon. A Maximum Likelihood Estimator of Pairwise Relatedness that Accounts for Evolutionary Effects The amount of relatedness between two individuals has been widely studied across disciplines. 
There are several cases in which accurate estimates of this quantity are important in the forensic arena. Perhaps the most common application is in the area of remains identification. In addition, there are several scenarios in which pairwise relatedness estimates may be required in the courtroom. For example, the defense may suggest that a relative of the suspect is the culprit of the crime. Many estimators of pairwise relatedness have been proposed over the years; however, none accounts for the potential effects of evolution. Populations tend to have subpopulations within them, and individuals typically mate within their own subpopulation. A small amount of inbreeding will result, which in turn introduces an additional amount of relatedness between random individuals within a subpopulation. This extra amount of relatedness should be taken into account when estimating pairwise relatedness. The objective of this research is to develop a new maximum likelihood estimator of pairwise relatedness that accounts for evolutionary effects. We build upon the foundation provided by earlier work in the area. A simulation study compares this new estimator to previous approaches, using simulated populations with and without inbreeding. We also evaluate the new estimator using a real data set, obtained from the FBI. Voodoo Science, "Those People," CSI, and the Doubt Machine This paper uses vignettes from courtroom experiences and theory to discuss some of the following ideas: popular conceptions and misconceptions about DNA, genetics and heredity, and forensics. Many times jury members seem to have a sort of magico-religious take on science, and especially DNA, that short-circuits actual understanding of science but also allows for decisionmaking. DNA as a sort of magical kernel inside people, connected to ties between heredity and social ills, is also an important issue. 
Social problems like poverty or crime are often thought of and described as hereditary and "transmitted" through families in a biological way, and this impacts trial strategy and jury-room decisions. Through selected courtroom experiences, the paper examines the ways that the criminal justice system works to create, re-create and maintain a group of people that can be called "the sociobio-underclass." It explores jury culture and jury dynamics. What does physical evidence mean to juries? How do power differentials play out in the jury room? Where is the ethical line when it comes to describing complicated scientific concepts to juries? In other words, how much can you "dumb it down" without losing meaning? Specific vignettes include the following: (1) "voodoo science" -- interactions with "criminalists" in cases where scientific method falls by the wayside in the interests of achieving a predicted outcome; (2) "those people" -- gang "experts" and their profiling techniques that borrow heavily from scientific language and discourse to achieve respectability, as well as the diminished-capacity defense; (3) "the CSI syndrome" -- expectations jurors have about scientific evidence and the limits of what can be presented effectively and understood in a trial setting; and (4) "the doubt machine" -- the defense strategy of switching between using emotion to create doubt about "cold facts" and harnessing that same forensic science to benefit the defense. |
|