Publications

We strongly believe in open source and giving to our community. We work directly with researchers in academia and seek out new perspectives with our intern and fellowship programs. We generalize our solutions and release them to the world as open source projects. We host discussions and publish our results.

In Diane Coyle, Wendy Alexander, and Brian Ashcroft, eds., New Wealth for Old Nations: Scotland’s Economic Prospects, Princeton: Princeton University Press, pp. 119-165. (2005)

Skill Policies for Scotland

James J. Heckman, Dimitriy Masterov

This paper argues that skill formation is a life-cycle process and develops the implications of this insight for Scottish social policy. Families are major producers of skills, and a successful policy needs to promote effective families and to supplement failing ones. We present evidence that early disadvantages produce severe later disadvantages that are hard to remedy.

We also show that cognitive ability is not the only determinant of education, labor market outcomes and pathological behavior like crime. Abilities differ in their malleability over the life-cycle, with noncognitive skills being more malleable at later ages. This has important implications for the design of policy. The gaps in skills and abilities open up early, and schooling merely widens them.

Additional university tuition subsidies or improvements in school quality are not warranted by Scottish evidence. Company-sponsored job training yields a higher return for the most able, and so this form of investment will exacerbate the gaps it is intended to close.

For the same reason, public job training is not likely to help adult workers whose skills are rendered obsolete by skill-biased technological change. Targeted early interventions, however, have proven to be very effective in compensating for the effect of neglect.

Quarterly Journal of Economics 120, no. 1 (2005): 131-172

Profit Sharing and the Role of Professional Partnerships

Steve Tadelis, Jonathan Levin

When it is hard to assess product quality, firms will sub-optimally hire low ability workers. We show that organizing as a profit-sharing partnership can alleviate these problems.

Our theory explains the historical prevalence of profit sharing in professional service industries such as law, accounting, medicine, investment banking, architecture, advertising, and consulting, and the relative scarcity of profit sharing in other industries.

It also sheds light on features of partnerships such as up-or-out promotion systems, and on recent trends in professional service industries. (JEL codes: D20, D82, J33, J44, J54, L22.)

The Journal of Law and Economics, 48(1):1-39. (2005).

Labor Market Discrimination and Racial Differences in Premarket Factors

Pedro Carneiro, James J. Heckman, Dimitriy Masterov

We investigate the relative significance of differences in cognitive skills and discrimination in explaining racial/ethnic wage gaps. We show that cognitive test scores taken prior to entering the labor market are influenced by schooling. Adjusting the scores for racial/ethnic differences in education at the time the test is taken reduces their role in accounting for the wage gaps.

We also consider evidence on parental and child expectations about education and on stereotype-threat effects. We find both factors to be implausible alternative explanations for the gaps we observe. We argue that policies need to address the sources of early skill gaps and to seek to influence the more malleable behavioral abilities in addition to their cognitive counterparts.

Such policies are far more likely to be effective in promoting racial and ethnic equality for most groups than are additional civil rights and affirmative action policies targeted at the workplace.

Journal of Machine Learning Research (JMLR), Volume 6, March 2005

A finite Newton method for fast solution of large scale linear SVMs

Sathiya Keerthi, Dennis DeCoste

This paper develops a fast method for solving linear SVMs with L2 loss function that is suited for large scale data mining tasks such as text classification. This is done by modifying the finite Newton method of Mangasarian in several ways.

Experiments indicate that the method is much faster than decomposition methods such as SVM(light), SMO and BSVM (e.g., 4-100 fold), especially when the number of examples is large. The paper also suggests ways of extending the method to other loss functions such as the modified Huber's loss function and the L1 loss function, and also for solving ordinal regression.
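As a rough illustration of the problem this abstract addresses, the sketch below writes out a common form of the L2-loss (squared hinge) linear SVM objective and its gradient. It is not the finite Newton method from the paper, and the paper's exact scaling may differ. The function names and the regularization weight `lam` are assumptions introduced here. The objective is piecewise quadratic in the weights, which is what makes Newton-type methods a natural fit.

```python
# Illustrative sketch only: a common form of the L2-loss linear SVM objective.
# The names l2_svm_objective, l2_svm_gradient and lam are hypothetical.

def l2_svm_objective(w, X, y, lam=1.0):
    """Return lam/2 * ||w||^2 + 1/2 * sum(max(0, 1 - y_i * <w, x_i>)^2)."""
    reg = 0.5 * lam * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        slack = max(0.0, 1.0 - margin)
        loss += 0.5 * slack * slack
    return reg + loss


def l2_svm_gradient(w, X, y, lam=1.0):
    """Gradient of the objective; only margin-violating examples contribute."""
    grad = [lam * wj for wj in w]
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        slack = 1.0 - margin
        if slack > 0.0:
            for j, xj in enumerate(xi):
                grad[j] -= slack * yi * xj
    return grad
```

Examples with `y_i * <w, x_i> >= 1` drop out of both the loss and the gradient, so each evaluation only involves the current set of margin violators.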

In Laura B. Nielsen and Robert L. Nelson, eds., Handbook of Research on Employment Discrimination: Rights and Realities, New York: Springer. (2005)

Understanding the Sources of Ethnic and Racial Wage Gaps and Their Implications for Policy

Pedro Carneiro, James J. Heckman, Dimitriy Masterov

Previous studies show that controlling for ability measured in the teenage years eliminates young adult wage gaps for all groups except black males, for whom the gap is reduced by approximately three-fourths. This suggests that disparity in skills, rather than the differential treatment of such skills in the market, produces racial and ethnic wage differentials.

However, minority children and their parents may have pessimistic expectations about receiving fair rewards for their skills in the labor market and so they may invest less in skill formation. Poor schools may also depress cognitive achievement, even in the absence of any discrimination.

We find that the evidence on expectations is mixed. Although all groups are quite optimistic about the future schooling outcomes of their children, minority parents and children have more pessimistic expectations about child schooling relative to white children and their parents when the children are young.

At later ages, expectations are more uniform across racial and ethnic groups. Gaps in ability across racial and ethnic groups also open up before the start of formal schooling, and the different trajectories of Hispanic and black students indicate that differences in schooling cannot be the source of cognitive disparities. Finally, test scores depend on schooling attained at the time of the test.

Adjusting for differences in schooling attainment at the age the test is taken reduces the power of measured ability to shrink wage gaps for blacks, but not for other minorities.

We also document the presence of disparities in noncognitive traits across racial and ethnic groups. These characteristics have been shown elsewhere to be important for explaining the labor market outcomes of adults. This evidence points to the importance of early (preschool) family factors and environments in explaining both cognitive and noncognitive ability differentials by ethnicity and race.

Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers. pp. 545-549. 2005

Data-Pattern Discovery Methods for Detection in Nongaussian High-dimensional Data Sets

Cécile Levasseur, Ken Kreutz-Delgado, Uwe Mayer, Gregory Gancarz

Many important expert system applications depend on the ability to accurately detect or predict the occurrence of key events given a data set of observations. We concentrate on multidimensional data that are highly nongaussian (continuous and/or discrete), noisy and nonlinearly related.

We investigate the feasibility of data-pattern discovery and event detection by applying generalized principal component analysis (GPCA) techniques for pattern extraction based on an exponential family probability distribution assumption.

We develop theoretical extensions of the GPCA model by exploiting results from the theory of generalized linear models and nonparametric mixture density estimation.

The Eighteenth Annual Conference on Neural Information Processing Systems (NIPS04), Vancouver, B.C., Canada, December 2004

Coarticulation in Markov Decision Processes

We investigate an approach for simultaneously committing to multiple activities, each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define a set of admissible solutions consisting of the redundant set of optimal policies, and those policies that ascend the optimal state-value function associated with them.

A plan is then generated by merging them in such a way that the solutions to the subordinate activities are realized in the set of admissible solutions satisfying the superior activities. We present our theoretical results and empirically evaluate our approach in a simulated domain.

The Twenty-First International Conference on Machine Learning (ICML04), Banff, Canada, July 4-8, 2004

Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data

Khashayar Rohanimanesh, Robert Platt, Sridhar Mahadevan, Roderic Grupen

In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when long-range dependencies exist.

We present dynamic conditional random fields (DCRFs), a generalization of linear-chain conditional random fields (CRFs) in which each time slice contains a set of state variables and edges—a distributed state representation as in dynamic Bayesian networks (DBNs)—and parameters are tied across slices.

Since exact inference can be intractable in such models, we perform approximate inference using several schedules for belief propagation, including tree-based reparameterization (TRP). On a natural-language chunking task, we show that a DCRF performs better than a series of linear-chain CRFs, achieving comparable performance using only half the training data.

NIPS workshop on Syntax, Semantics, and Statistics, Vancouver, Canada, December 2003

Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences

Andrew McCallum, Khashayar Rohanimanesh, Charles Sutton

Conditional random fields (CRFs) for sequence modeling have several advantages over joint models such as HMMs, including the ability to relax strong independence assumptions made in those models, and the ability to incorporate arbitrary overlapping features. Previous work has focused on linear-chain CRFs, which correspond to finite-state machines, and have efficient exact inference algorithms.

Often, however, we wish to label sequence data in multiple interacting ways—for example, performing part-of-speech tagging and noun phrase segmentation simultaneously, increasing joint accuracy by sharing information between them.

We present dynamic conditional random fields (DCRFs), which are CRFs in which each time slice has a set of state variables and edges (a distributed state representation, as in dynamic Bayesian networks) and parameters are tied across slices. (They could also be called conditionally trained Dynamic Markov Networks.) Since exact inference can be intractable in these models, we perform approximate inference using the tree-based reparameterization framework (TRP). We also present empirical results comparing DCRFs with linear-chain CRFs on natural language data.
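To make concrete what "efficient exact inference" means for the linear-chain case mentioned in this abstract, here is a minimal sketch of the standard forward algorithm for the log partition function. It is not code from the paper and does not cover DCRFs or TRP. The score tables `unary` and `pairwise` and all function names are assumptions introduced for illustration. The brute-force version, which enumerates every labeling, is included only to show what the forward recursion computes.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(unary, pairwise):
    """Forward algorithm for a linear-chain model.

    unary[t][s]    : hypothetical score of label s at position t
    pairwise[r][s] : hypothetical score of moving from label r to label s
    Cost grows linearly in sequence length, quadratically in label count.
    """
    n_labels = len(unary[0])
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][s]
            + logsumexp([alpha[r] + pairwise[r][s] for r in range(n_labels)])
            for s in range(n_labels)
        ]
    return logsumexp(alpha)


def log_partition_brute_force(unary, pairwise):
    """Same quantity by enumerating every label sequence (exponential cost)."""
    n_labels = len(unary[0])
    scores = []
    for labels in product(range(n_labels), repeat=len(unary)):
        score = sum(unary[t][s] for t, s in enumerate(labels))
        score += sum(pairwise[a][b] for a, b in zip(labels, labels[1:]))
        scores.append(score)
    return logsumexp(scores)
```

The recursion is exact because each position interacts only with its neighbor. Once every time slice holds several interacting state variables, as in the models described above, that chain structure is lost, which is why the abstract turns to approximate inference.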
