Welcome to Flirting with Data. I have organized this blog around projects, which are made up of data and tools. Each post falls into one of four categories: project, data, tools, or aside. The asides tend more towards soliloquies, though. The main purpose of the blog is to keep my thoughts organized for later reference. Hopefully there’s something here to help others organize their data analysis setup, and maybe some useful data. Feel free to contact me if you’d like to contribute in any way (as a guest writer, or with code, data, or tools).
An end goal of this blog is to track projects in all phases. If anybody has a better template for managing projects on a blog, especially something fast and scrum-like, I’d love to hear about it, although I’m not sure how formal I want to get with it. I don’t want any overhead, but I do want small, clearly defined chunks of work that could be done in a couple of hours of free time here and there. Perhaps a position at a small liberal arts college somewhere is a solution. Cheap, happy labor; no professional feedback; mild peer review. There were reasons it took me (how many?) years to get my PhD. And the full-time job was one of them…
- Whole Genome Scan Algorithm Improvement
- In Progress; 20% towards goal
- worked through some linear algebra
- need to get draft algorithm implemented
- need to get candidate dataset for association study
- need some preliminary results for my version and current practice
- decide if theory pays off in this case
- Digital Image Enlargement Algorithm
- Research Not In Progress; 5% towards goal
- solicited feedback from friends in related (image processing) field
- ballparked the infrastructure needed for printing, storing, manipulating the relevant images; put together a collage in Illustrator to test printing on the plotter (4′ wide “print”)
- KML Tools (Google Earth format)
- Research Started; Not In Progress; 20% towards goal
- wrote code in S language (R) to take various data formats as input (different lat/long representations in databases or spreadsheets) and write out the corresponding KML files
- would like to convert it to S4 OO code so that it’s more maintainable
- need to add support for everything other than points and lines
- Statistical Analysis Virtual Machine
- Not Started; dream stage
- I want to spend my time analyzing data, not configuring a new workspace every time I get a new computer or start a new job. Thus, I would like to set up a virtual machine that has my workspace set up, which I can clone and carry around with me.
- Fisheries Management/Conservation
- Portfolio Optimization
- Basically, how do I invest my meager savings for the long run?
- patents
- java
- Heterozygosity and humans
- opt ed
- Weather
- Random graphs; TSP
- Dynamical system estimation (non-linear regression)
- metabolic engineering
- fisheries modeling
- Academic Explorer
- Academic Thermometer
- Priors for spherical data (think Earth habitat): a cylinder is probably good enough, and a square is a decent approximation most of the time; it’s just not that elegant.
- Graph traverser
- Function Space Optimization (Chinese Train Problem)
- Tetrachromat test
- thesis pub [1] Senescence paper
- thesis pub [2] distances for annotations from ontologies
- Scalable S
- Sent out some emails; killed project
- It was brute force, and I ended up going with a more elegant bootstrap approach which is much less computationally intensive.
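
To make the KML Tools item above a bit more concrete, here is a minimal sketch of the idea: normalize a couple of lat/long representations and write out point placemarks. It is in Python rather than the R code mentioned above, and it is not that code. The function names `to_decimal` and `points_to_kml`, the accepted input formats, and the sample row are all assumptions made up for illustration.

```python
import re
from xml.sax.saxutils import escape


def to_decimal(value):
    """Convert a coordinate to signed decimal degrees.

    Assumed inputs (hypothetical, for illustration only): a number, a
    decimal string such as "-122.33", or degrees/minutes/seconds text
    ending in a hemisphere letter, such as "47 36 35 N".
    """
    if isinstance(value, (int, float)):
        return float(value)
    text = value.strip()
    try:
        return float(text)
    except ValueError:
        pass
    # Pull out up to three numbers: degrees, minutes, seconds.
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers or len(numbers) > 3:
        raise ValueError(f"cannot parse coordinate: {value!r}")
    numbers += [0.0] * (3 - len(numbers))
    degrees = numbers[0] + numbers[1] / 60 + numbers[2] / 3600
    # South and west are negative; so is an explicit leading minus sign.
    hemisphere = text[-1].upper()
    if hemisphere in "SW" or text.startswith("-"):
        return -degrees
    return degrees


def points_to_kml(rows):
    """Build a KML document from (name, lat, lon) rows, points only."""
    placemarks = []
    for name, lat, lon in rows:
        # KML orders coordinates as longitude,latitude.
        coords = f"{to_decimal(lon):.6f},{to_decimal(lat):.6f}"
        placemarks.append(
            "  <Placemark>\n"
            f"    <name>{escape(name)}</name>\n"
            f"    <Point><coordinates>{coords}</coordinates></Point>\n"
            "  </Placemark>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n" + "\n".join(placemarks) + "\n</Document>\n</kml>\n"
    )


if __name__ == "__main__":
    # Made-up sample row mixing two coordinate styles.
    print(points_to_kml([("Sample site", "47 36 35 N", "122°19'59\"W")]))
```

Lines and the other geometry types on the to-do list would need their own element builders; this sketch covers points only.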



