Does it make sense to parallelize machine learning algorithms as part of PhD research? [closed]


Answers (2)

manpreet Best Answer 2 years ago

I'm developing machine learning algorithms to aid in the diagnosis and prognosis of various cancers for my PhD. My lab is an Nvidia teaching center (CUDA).

My supervisor thinks that I also need to optimize ML by parallelizing it in CUDA. However, as I see it, a model is trained once and there is no need to train it again. Testing a model is also not time consuming. My interests are in ML, not parallel processing.

1) Should I spend a large chunk of my time parallelizing with CUDA?
2) Is CUDA still a viable framework for research? 
3) In the world outside of research, will this make it easier to get a ML job?

manpreet 2 years ago

From a practical point of view, I'm just sharing some thoughts. I don't have any research (PhD) experience, and your supervisor may well make similar points to the ones below.

By parallel computation here I personally mean a single PC using graphics-card (GPU) cores to accelerate computation, as opposed to cluster computing.

Some Theoretical Thoughts

In ML it is sometimes worth building multiple models, or even the same model multiple times, on the same data.

Typical examples include:

  • using cross-validation (the same method multiple times) to obtain robust model outputs or parameters, or
  • building an ensemble of many weak learners (multiple models) to improve accuracy.

These ML processes are time consuming, and parallelization can help reduce that time, as in the sketch below.
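
As a rough illustration only, here is a minimal sketch (assuming Python with scikit-learn and a synthetic dataset; none of these names come from the question) of both cases, cross-validation and an ensemble of weak learners, with the independent fits spread across CPU cores via n_jobs. The same reasoning is what motivates pushing the per-model work onto a GPU.

```python
# Hypothetical example: parallel cross-validation and a parallel ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for real data.
X, y = make_classification(n_samples=5000, n_features=50, random_state=0)

# Ensemble of weak learners: each of the 500 trees is fit independently,
# so n_jobs=-1 spreads the fits over all available cores.
model = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=0)

# Cross-validation: each of the 10 folds is trained and scored independently,
# so the folds can also run in parallel.
scores = cross_val_score(model, X, y, cv=10, n_jobs=-1)
print(scores.mean(), scores.std())
```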

Also, from the information provided, I assume your ML project is image recognition with GPU acceleration. I don't know the main purpose of the project: it could be developing or improving new ML methods, or comparing known ML methods to form an academic review. Whatever the case, I assume the results still need to reach a certain level of accuracy.

Hence, it makes sense to consider an efficient approach (e.g. parallel computation) to accelerate the modelling process.
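
To make that concrete, here is a hedged sketch, assuming PyTorch and an image-classification task (the model, sizes and class labels are placeholders, not from the original post), of how GPU acceleration is usually switched on in practice: the only CUDA-specific lines are the .to(device) calls, while the rest of the training loop is unchanged.

```python
# Hypothetical example: one GPU-accelerated training step for a toy image classifier.
import torch
import torch.nn as nn

# Falls back to CPU if no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(                      # toy CNN classifier
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 2),                       # e.g. two diagnostic classes
).to(device)                                # weights live on the GPU

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random batch standing in for real image data.
images = torch.randn(32, 3, 64, 64).to(device)
labels = torch.randint(0, 2, (32,)).to(device)

opt.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
opt.step()
print(loss.item())
```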

Some Practical Considerations

In practice, efficiency is very important. A theoretically accurate model that takes a long time to build won't be acceptable.

You could step back and fall back on a simple, quick model with less accuracy, but then what's the point of ML when traditional methods have similar or better accuracy?

Personal Answers to the Questions

1) Should I spend a large chunk of my time parallelizing with CUDA? Parallelization is useful for making ML training quicker, and CUDA is a worthwhile technique to learn and apply to ML. Just balance the time spent on it against the main purpose of the project.

2) Is CUDA still a viable framework for research? That is the sort of question the first chapter of your report / essay / dissertation is meant to answer. At the very least, more and more ML uses deep learning, which can be much quicker with a GPU involved.
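
As a rough, illustrative sketch only (assuming PyTorch and an available CUDA device; the matrix size is arbitrary), this is the kind of comparison that shows why deep-learning workloads lean on the GPU; actual numbers depend entirely on the hardware.

```python
# Hypothetical example: time the same large matrix multiplication on CPU and GPU.
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

t0 = time.time()
_ = a @ b
cpu_s = time.time() - t0

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    _ = a_gpu @ b_gpu                 # warm-up so startup cost isn't timed
    torch.cuda.synchronize()
    t0 = time.time()
    _ = a_gpu @ b_gpu
    torch.cuda.synchronize()          # wait for the kernel to finish before timing
    gpu_s = time.time() - t0
    print(f"CPU: {cpu_s:.3f}s  GPU: {gpu_s:.3f}s")
```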

3) In the world outside of research, will this make it easier to get an ML job? I can't definitively say that knowing CUDA/parallelization would make you a top candidate (other aspects will also be considered). But when candidates have otherwise similar backgrounds, the one with parallel-computation experience stands out.

