manpreet
Best Answer
3 years ago
I have created a wrapper around the spark-submit command so that I can generate real-time events by parsing its logs. The purpose is to build a real-time interface that shows the detailed progress of a Spark job.
So the wrapper will look like this:
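A minimal sketch of the usage, assuming the `SparkSubmitter` class sketched further down (the constructor arguments and job details here are illustrative placeholders, not my exact code):

```python
# Hypothetical usage; application path and master URL are placeholders.
submitter = SparkSubmitter(
    application="my_spark_job.py",
    master="spark://master:7077",
)

for event in submitter.submit():
    # Each Event is parsed out of one spark-submit log line.
    print(event)
```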
And the output will look like the following:
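Something along these lines (the exact events and values are illustrative):

```
Event(message='INFO SparkContext: Running Spark version 2.4.0')
Event(message='INFO YarnScheduler: Adding task set 0.0 with 4 tasks')
Event(message='INFO DAGScheduler: Job 0 finished')
```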
Internally, the SparkSubmitter class launches the spark-submit command as a subprocess.Popen process, then iterates over the stdout stream and yields Events by parsing the logs the process generates, like this:
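A simplified sketch of that internal logic (the real parsing is more involved; the `Event` type and `parse_log_line` body here are placeholders):

```python
import subprocess
from collections import namedtuple

# Simplified Event; the real class carries richer progress fields.
Event = namedtuple("Event", ["message"])

class SparkSubmitter:
    def __init__(self, application, master):
        self.application = application
        self.master = master

    def submit(self):
        # Launch spark-submit as a child process with stdout piped back to us.
        process = subprocess.Popen(
            ["spark-submit", "--master", self.master, self.application],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        # Iterate over stdout line by line and turn log lines into Events.
        for line in iter(process.stdout.readline, ""):
            event = self.parse_log_line(line)
            if event is not None:
                yield event
        process.wait()

    def parse_log_line(self, line):
        # Placeholder parsing: the real code extracts stage/task progress.
        line = line.strip()
        return Event(message=line) if line else None
```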
This implementation works well with a Spark Standalone cluster, but I am having an issue when running on a YARN cluster.
On the YARN cluster, the Spark-related logs come in on stderr instead of stdout, so my class cannot parse the Spark-generated logs because it only reads stdout.

Question 1: Is it possible to read Popen's stdout and stderr as a single stream?

Question 2: Since stdout and stderr are both streams, is it possible to merge the two streams and read them as one?
Question 3: Is it possible to redirect all the logs to only stdout?
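To make Questions 1 and 3 concrete, what I am hoping for is something along the lines of subprocess's `stderr=subprocess.STDOUT` redirection, where both streams arrive on a single pipe (a sketch of the intended behavior, not code I have verified on YARN):

```python
import subprocess

# Redirect the child's stderr into its stdout so both arrive on one stream.
process = subprocess.Popen(
    ["spark-submit", "--master", "yarn", "my_spark_job.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # merge stderr into stdout
    universal_newlines=True,
)

for line in iter(process.stdout.readline, ""):
    # Spark's logs (stderr on YARN) and normal output both land here.
    print(line, end="")
process.wait()
```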