Janara Christensen
Carleton College
Senior Computer Science Major
christej <AT> carleton <DOT> edu
About Me:
- I am an undergraduate computer science major at Carleton College in Northfield, MN. I will graduate in June 2008 and hope to attend graduate school next year. My research interests center on data mining; I am especially interested in how data mining relates to natural language processing, privacy, and machine vision, among other areas.
CV:
- My CV can be downloaded here.
Research:
- My Senior Integrative Exercise is a group research project on developing methods for
making support vector machines explainable. We are examining the structure of the dual
SVM formulation and writing a graphical user interface that charts the most important
support vectors for each test input vector. Later we will extend our results to boosting
and modify the traditional SVM formulation to minimize the number of support vectors
needed for a particular test point. Our project advisor is Professor David Musicant.
- In June 2006, I joined Professor David Musicant's research team to work on the EDAM
(Exploratory Data Analysis and Monitoring)
project. Through this interdisciplinary research initiative involving computer scientists
and atmospheric scientists at Carleton College and the University of Wisconsin at Madison,
I have learned basic data mining and database theory, studied clustering methods, learned
about representing high dimensional data, and researched new twists on supervised learning.
My most recent work on the EDAM project has been on supervised learning. The atmospheric
scientists were interested in predicting the amount of elemental carbon in an area from
measurements of the elemental makeup of individual atmospheric particles. On the surface,
this problem appeared to be classic regression. On closer examination, however, we saw
that the input variable (the elemental makeup of the individual particles, collected per
second) was of a different granularity than the output variable (the amount of elemental
carbon in the atmosphere, collected per hour). Because of this difference, each output
value corresponded to the aggregate output of several input data points. We were not
able to find any evidence that this problem had been examined before, so we developed ways
of modifying existing classification and regression algorithms to solve this new problem.
We modified classification and regression for k-nearest neighbor, support vector machines,
and neural networks. After working as part of a team formulating the specific modifications,
I conducted the SVM experiments. In the end, we wrote a paper, "Supervised Learning by
Training on Aggregate Outputs," that was accepted by ICDM for publication as a long paper.
In October, I presented the paper at ICDM and a poster version at the Grace Hopper
Machine Learning Workshop 2007.
- I am also working on a research project with Professor David Liben-Nowell and a professor from the Chemistry Department to examine tRNA molecules. This project has given me the chance to experience computational biology research. Right now we are looking at the distribution of the tRNA molecules and determining whether the number of tRNA molecules is limited by the differentiation space. So far, we have written programs to maximize the differentiation between the tRNA molecules.
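To illustrate the idea behind the explainable SVM project above: in the dual form, an SVM's decision value for a test point x is a sum of one term per support vector, alpha_i * y_i * K(x_i, x), plus a bias. One plausible way to rank "the most important support vectors" for a test point is by the absolute size of each term. The following is a minimal sketch under that assumption, not the project's actual code; the RBF kernel, the `gamma` value, the function names, and the toy model are all hypothetical.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def support_vector_contributions(support_vectors, dual_coefs, test_point,
                                 kernel=rbf_kernel):
    """Return (index, contribution) pairs sorted by decreasing |contribution|.

    dual_coefs[i] holds alpha_i * y_i for the i-th support vector, so each
    contribution is one term of the dual decision function.
    """
    contributions = [
        (i, coef * kernel(sv, test_point))
        for i, (sv, coef) in enumerate(zip(support_vectors, dual_coefs))
    ]
    return sorted(contributions, key=lambda pair: abs(pair[1]), reverse=True)


def decision_value(contributions, bias):
    """Dual decision value: sum of all per-support-vector terms plus the bias."""
    return sum(value for _, value in contributions) + bias


# Hypothetical toy model: three support vectors with made-up coefficients.
support_vectors = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
dual_coefs = [-1.0, 1.0, 0.5]   # alpha_i * y_i (illustrative values only)
bias = 0.1
test_point = (1.0, 0.5)

ranked = support_vector_contributions(support_vectors, dual_coefs, test_point)
score = decision_value(ranked, bias)

for index, value in ranked:
    print(f"support vector {index}: contribution {value:+.4f}")
print(f"decision value: {score:+.4f}")
```

A chart of the ranked contributions is the kind of view a graphical interface could show for each test input; the far-away support vector contributes almost nothing here because the RBF kernel decays with distance.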
Publications:
- David R. Musicant, Janara M. Christensen, Jamie F. Olson. "Supervised Learning by Training on Aggregate Outputs". Proceedings of the Seventh IEEE International Conference on Data Mining, IEEE Press, 2007. To be published. Copyright (C) 2007 IEEE.
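The aggregate-output setting described under Research can be made concrete with a small example. The sketch below is not one of the modified k-nearest neighbor, SVM, or neural network algorithms from the paper; it is a simplified least-squares illustration of the problem itself. It assumes a single input feature, a linear per-point model f(x) = w*x + b, and that each coarse-grained output is the sum of the per-point outputs in its group. The function name and the toy data are hypothetical.

```python
def fit_on_aggregates(groups, totals):
    """Fit f(x) = w*x + b using only one observed total per group.

    Summing f over a group gives w*S + b*n, where S is the sum of the
    group's inputs and n is its size. Fitting w and b is therefore ordinary
    least squares on the pairs (S, n) with no extra intercept.
    """
    sums = [sum(group) for group in groups]
    sizes = [len(group) for group in groups]

    # Entries of the 2x2 normal equations.
    a11 = sum(s * s for s in sums)
    a12 = sum(s * n for s, n in zip(sums, sizes))
    a22 = sum(n * n for n in sizes)
    r1 = sum(s * t for s, t in zip(sums, totals))
    r2 = sum(n * t for n, t in zip(sizes, totals))

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("groups do not determine w and b uniquely")

    # Cramer's rule for the two unknowns.
    w = (r1 * a22 - a12 * r2) / det
    b = (a11 * r2 - a12 * r1) / det
    return w, b


# Hypothetical data: fine-grained inputs grouped by coarse time window,
# with one total per window generated from w = 2, b = 1.
groups = [[1.0, 2.0, 3.0], [0.5, 1.5], [4.0, 0.0, 1.0, 1.0]]
totals = [15.0, 6.0, 16.0]

w, b = fit_on_aggregates(groups, totals)
print(f"w = {w:.3f}, b = {b:.3f}")
```

No individual input ever has its own label here; the per-point model is recovered only because the groups differ in both their input sums and their sizes.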
Relevant Links:
- Carleton College
- Carleton College Computer Science Department
- ENCHILADA
- EDAM ENCHILADA at SourceForge
Last updated 01/03/2008