Hi, I'm Sam. I enjoy using data to empower people.
I match transitioning military service members to tech jobs. I have the best job in the world: it's a hybrid between data science and operations. I'm responsible for designing data systems and writing algorithms to match veterans to jobs, and I'm also the client for those algorithms, creating a virtuous circle. I also serve as product manager for various data products that use machine learning and common sense to allow Shift's users to leverage large pools of data to make career decisions. Shift is a tech startup in Berkeley, California. Its primary investors include Andreessen Horowitz, Structure Capital, Expa.com, and Tim Ferriss.
CommonLit is a literacy nonprofit growing at 500k users/month; I helped make it data-driven. My work included creating business intelligence workflows, teaching CommonLit staff SQL, conducting A/B testing, building data visualizations, automating analytics workflows, and building psychometric models in R. My favorite project was probably designing and implementing a complex randomized controlled trial (involving blocking and clustering) to successfully test the effectiveness of one of CommonLit's new features.
As my work at Shift has ramped up, I've needed to ramp down my involvement with CommonLit, but it's an awesome organization full of awesome people with whom I was fortunate to work.
At GoldenKey, my role was to derive value from data. I was a hybrid data architect, A/B testing lead, product manager, business intelligence analyst, and data product builder.
I also interned for GoldenKey (then called SoloPro) for three months while I was still in the Army. I wrote about the military-to-startup internship model in Inc. The experience made me wonder why there wasn't a more formal program for work-trial fellowships for transitioning service members. Three years later, that's what Shift is building.
I built teams, led teams, planned operations, and executed operations in the 101st Airborne Division and The Old Guard. Here are some highlights:
Majored in international relations, with a track in mechanical engineering and two years of Mandarin Chinese. Also suffered through lots of "fitness."
Grudgingly took lots of mandatory calculus/physics/chemistry/statistics courses; became thankful for them five years later.
5-month in-person crash course on machine learning using Python.
HBX CORe is essentially the first semester of an MBA program. I took three courses (business analytics, financial accounting, and economics for managers), delivered through HBS.
|Python||Primary language for scripting. Rapid prototyping for machine learning models, web applications, automating workflows, creating visualizations, and just about anything else I need it for. I generally prototype in Jupyter notebooks, then transfer work to proper modules.|
|SQL (PostgreSQL)||My first job out of the military involved trying to make business sense out of a messy OLTP database. I can turn just about any data question into a SQL query.|
|R||I generally use R over Python in two places. First, I'll use R when the majority of the task is standard/frequentist statistical analysis. For example, I find the analysis associated with conducting randomized controlled trials and A/B tests more straightforward in R than in Python. Second, I prefer R when the task at hand has better libraries in R than Python. One example is psychometric modeling; for some reason, all of the good psychometric/item response theory libraries (
|Machine Learning||I have broad experience with supervised/unsupervised methods across classification, regression, and clustering problem spaces. Where necessary, I'm comfortable chaining together different ML techniques to enrich the overall output. For example, I recently had one workflow that involved using a neural network to produce word embeddings, which I then clustered using k-means to find a particular type of word, which I then used to train a custom named entity recognition system, which I then used to make inferences (finding my custom entity in unseen text), the results of which I fed into a Latent Dirichlet Allocation model that provided value to users.|
|Field Experiments & Causality||I have experience designing and implementing hypothesis tests for causal inference, ranging from simple UI optimization to complex designs with blocking, clustering, attrition, and spillover. I particularly enjoyed an RCT I implemented for CommonLit that successfully tested the effectiveness of a new feature. Allocating a feature to one population but not another creates obvious UX concerns; in order to avoid the poor UX, there were some spillover and other experimental design considerations that my team and I needed to mitigate through statistical trickery.|
|Data Architecture & Data Engineering||I'm fairly adept at acquiring and organizing data, as no interesting machine learning work is possible without quality data. In my experience, the best way to improve as a data engineer is simply to work on more real-world data engineering problems... and I don't foresee any shortage of those in my future.|
|Big Data & Cloud Tools|
|HTML and CSS|
|Data / Engineering|
|DevOps and Frameworks|
|AWS EC2 / EMR / S3|
|Google Compute Cloud|
|Plotly & Bokeh|
|Additional Favorite Data Science Libraries and Tools|
I like to take pictures.
I post my photos on Unsplash.