How I Got into Data Science

To me, data science is all about solving interesting puzzles and leveraging data for real-world impact.

I was first exposed to the potential of data to answer questions that interested me as an undergraduate. I studied at Georgetown’s School of Foreign Service, where every academic discipline emphasizes international relations. During my first two years, I pursued my interest in Asian Studies. As a Chinese-American, I grew up in a multicultural home. My mom is ethnically Chinese, yet she was born in Peru and as a result never learned to speak Chinese. This heritage was something I was excited to explore in college. Yet after countless courses steeped in theory and in qualitative methodologies for making sense of how nations interact with one another, I found myself gravitating toward the field of political economy. This program at Georgetown equips students with empirical tools for analyzing issues at the nexus of economics and political science. One of the books that really galvanized my interest in this field was Freakonomics. I was fascinated by the creativity with which Steven Levitt used data to explore cheating in sumo wrestling and in the Chicago public school system, and I loved how data could be used to uncover hidden behavior. My own research ended up centering on estimating the impact of corruption on economic development using firm-level data from the World Bank.

My first job out of college gave me a taste of how I could use my data skills to have a real-world impact. After graduating from Georgetown, I joined the Consumer Financial Protection Bureau (CFPB), a federal agency that regulates financial markets. The Bureau was created in the aftermath of the 2008 financial crisis with a mandate to protect the American people from exploitation by banks and other financial institutions. I was appointed to my role at the CFPB through the Director’s Financial Analyst program, a fellowship housed in the Office of the Director that embeds fellows with strategically important teams across the agency. I spent most of my time at the CFPB working on the data science team. One of our main clients was the Bureau’s Office of Enforcement, the department that conducts investigations and brings actions against financial companies found to be violating federal regulations. My primary responsibility was assisting these teams of lawyers by supporting their legal arguments with data. We had the authority to subpoena proprietary data from companies suspected of breaking the law. Along the way I taught myself Python and SQL. As a public servant, I saw the impact of our work on citizens across the country and contributed to investigations of companies such as payday lenders and online lenders. In the few years since the Bureau was established, we have already brought over $12 billion of relief to consumers who were harmed. Working at a mission-driven organization like the CFPB, I felt privileged to use data to improve the lives of everyday people.

After my fellowship at the CFPB, I had a unique opportunity to pursue graduate studies overseas. I was awarded a fellowship to pursue a fully funded Master’s degree at Peking University, the top university in China, through a new scholarship program called the Yenching Academy. The aim of Yenching is to bring 200 young scholars and leaders from around the globe to study together in Beijing for two years. I sought to apply my background as a data scientist to open new avenues of China-related research. My Master’s thesis developed a new methodology for understanding China’s bilateral relationships with other countries by tracking readings of particulate-matter air pollution. My approach exploited the relationship between daily pollution levels and the timing of diplomatic meetings in Beijing to quantify how the Chinese government views its relations with different countries. I also used variation in weather patterns, and its effect on air pollution, as a natural experiment to further strengthen my findings. This project involved constructing an original dataset of political meetings held in Beijing over an 11-year period through web scraping, which allowed me to further develop my programming skills.

Since graduating and moving back to the United States, I have found myself immersed once again in the data science world. I was recently awarded a fully funded fellowship by the Flatiron School in Washington, DC to attend a data science bootcamp designed for people with PhDs and Master’s degrees in STEM fields who are transitioning from academia to industry. I am thrilled by this opportunity to further develop my skills in programming, statistics, machine learning and big data. Although I’m not exactly sure what direction I will ultimately pursue, I am excited to continue to grow as a data scientist equipped with the tools to tackle interesting puzzles in the world.