I’m not sure this is the right place for this, but I’m in a bit of a dilemma about next steps for my career and just wanted to outline my thinking in case anyone else has been in this position and has a bit of insight.

The crux of the matter is that I’m currently an entry-level software engineer at a big tech company trying to figure out how to get more involved with theoretical computer science/applied math research and whether I should apply for grad school (PhD, master’s, etc).

Some background: I’m a recent computer science bachelor’s graduate (class of 2022). Most of the classes I took in undergrad (after the necessary reqs) were theoretical CS flavored (advanced algorithms, cryptography, computational complexity, etc) along with an occasional math course (modern algebra, combinatorics). I joined my current big tech company back in August/September as a software engineer, primarily working on machine learning infrastructure. I’ve found the work reasonably interesting at times and have definitely been learning a lot about how to be a better engineer.

That being said, I don’t think this work is the right fit for me interest-wise in the medium/long-ish term. If I had no work-related obligations, I would be spending nearly all of my time exploring deep learning theory related research. To be explicit, this is about formulating a more well-understood theory of deep learning, potentially with concrete guarantees (as opposed to more experimental deep learning research that seems to be all the rage these days). Indeed, this is what I’m doing right now: nearly every morning I get up and spend a couple hours before work exploring recent literature, playing around with half-baked ideas, working through a textbook, etc. My weekends are almost entirely dominated by this. I could plausibly see myself getting interested in another theory-related topic, though the further a topic gets from theory, the less likely I am to stay interested in it.

The main bottleneck I’m currently having is that I’m not able to spend as much time as I’d like to do research/learn. Part of the problem is that work often gets busy, but even on relatively easier weeks I still wish that I had even more free time to work on research/learning.

One important thing to note is that I have not trained extensively as a researcher. I have worked in a few research labs in undergrad (mostly experimental machine learning but also with a theoretical computer science professor, though we never got particularly far). I am skeptical about my current ability to perform well as a theoretical researcher without some extensive mentorship. I also expect that my math background is weak relative to most theory researchers. I can’t say for sure if research will interest me long-term, but I’m willing to test this for the next couple of years. I am fairly confident in my ability to return to a software engineering position if things don’t work out though.

At this point, I’m considering what my options might be for both the short and long term.

  1. Apply for a PhD program directly, potentially with options of either industry or academic careers (and if push comes to shove, returning to become a software engineer again afterwards).
  2. Apply for a master’s and do as much research as I can during that time to see if I truly enjoy doing it/have the ability to do it well, to potentially apply for a PhD afterwards.
  3. Try and find a way to get myself on a team at my company that is doing work more aligned with what I’m interested in long-term.
  4. Try and find a way to do research with nearby research labs (reaching out/cold emailing professors, postdocs, PhD students, etc).
  5. (4a?) Reaching out to previous professors who I worked with to try to continue doing work with them.
  6. Quitting my job and spending as much of my time as possible doing research (while living on savings?).

These options aren’t necessarily mutually exclusive. I’d expect that if I went for 1 or 2, I would apply in Fall 2023 with the expectation that I would join in Fall 2024 (which is pretty annoying actually, I’m wondering if there are any opportunities with faster turnaround times).

With respect to 3, the company that I work at has research labs with a good number (>15?) of researchers working in areas I would be super excited to work in (as well as many more working in areas that I would be generally enthusiastic about but slightly less interested in). I did reach out to one of them to see if there could be any opportunity of doing part-time work with them, but the impression I got was that I need to significantly strengthen my math background before being able to contribute, which I felt was fair.

Sorry if this was a bit of a ramble, happy to clarify any details as needed. What am I currently overlooking when thinking about this dilemma? How would others approach this decision-making process? As a followup question: if I decide to apply for a PhD/master’s program in the next admissions cycle, what are the best things I can do in the next few months to strengthen my application?

3 Answers

4

Unless you are exceptional, the best way to get to do theoretical machine learning starts out with getting a Ph.D. Experience in the software field is a big plus in Computer Science, as much theoretical work includes implementation. If you enter a Ph.D. program and are not successful, but not catastrophically so, you will get an M.S. (in the US at least).

Studying on your own is difficult, since these days Machine Learning uses analysis (optimization), linear algebra, statistics, and newly developed Mathematics, among others. The Ph.D. will give you guidance on how much time to spend on each of these.

Working with a university team puts the cart before the horse, as you probably do not have a good base.

Some Ph.D. programs will admit for each quarter / semester.

Finally, a caveat: Many practitioners go through a hard time in the first years of their career and then like the idea of returning to school as in retrospect it seems a lot more fun than it actually was. Make sure to give it some time before you make a decision to know that this does not apply to you. A Ph.D. needs stamina.

On the other hand, while a Ph.D. might be a wash as far as lifetime earnings are concerned, you get to have more interesting work.

1

I recommend one of the first three options. Better yet: I recommend trying all of the first three options and seeing what sticks.

Apply for a PhD program directly, potentially with options of either industry or academic careers (and if push comes to shove, returning to become a software engineer again afterwards).

I see very little downside to doing this. If you get in, it may be challenging to give up your job and return to a grad school stipend. But for theoretical research like you describe, a PhD is by far the most logical next step. And applying is not too painful; this will give you a better sense of what your options are.

Apply for a master's and do as much research as I can during that time to see if I truly enjoy doing it/have the ability to do it well, to potentially apply for a PhD afterwards.

If you can't get into a suitable PhD program now, this might be a good alternative, especially if your company offers tuition reimbursement and you are able to pursue this without quitting your job. This will let you explore your interest while strengthening your PhD application. You could even apply for both this and PhD programs and see what you get.

Try and find a way to get myself on a team at my company that is doing work more aligned with what I’m interested in long-term...[I asked about this and] the impression I got was that I need to significantly strengthen my math background before being able to contribute.

If your company offers tuition reimbursement, you may be able to pick-and-choose some suitable math classes that will better qualify you to do this sort of work. "Pivoting" in this way is great, since you wouldn't have to take time off to go back to school. But it'd be a little bit unusual to build a theoretical career with only a BS + a few random classes; I suspect you'll eventually need/want at least a master's.

Try and find a way to do research with nearby research labs (reaching out/cold emailing professors, postdocs, PhD students, etc). (4a?) Reaching out to previous professors who I worked with to try to continue doing work with them.

Maybe, but what do you have to offer the professor? Free software engineering, maybe, but I don't think this is a long-term solution. It could work in the short term as a way to strengthen your PhD application.

Quitting my job and spending as much of my time as possible doing research (while living on savings?)

Financially risky, and it's hard to make progress without an advisor. And even if you do make progress, it's hard to "sell" it in many cases.

0

Check out ML Collective (https://mlcollective.org/).

They provide an entry point for ML researchers. Copying from their website:

ML Collective (MLC) is an independent, non-profit organization with a mission to make machine learning (ML) research opportunities, including collaboration and mentorship opportunities accessible and free for all. We execute our mission via two broad efforts: (1) community building, with open platforms that allow people to connect and collaborate, governed by recurring events and meetups that provide a structure for growth; and (2) research training, where we adopt a peer-mentoring model: researchers simply come on their own accord and meet on a regular cadence to help move projects forward.
