Posted on 16 Aug 2022 under Bugs & Fixes, General Tech.
The GroupBy during my WriteToText operation fails because it runs out of memory, which kills my Dataflow job. Running the job locally, I run out of memory as well.
Based on the WriteToText source code, it seems that specifying the number of shards should help with this issue. I am not sure how to choose the number of shards, though. Can anyone explain a process for choosing the number of shards?
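For what it's worth, one rough way to pick `num_shards` (the parameter `WriteToText` accepts) is to target a fixed shard size and divide: shards ≈ total output bytes / target bytes per shard, clamped to at least 1. A minimal sketch of that heuristic; the 256 MB target is an assumption to tune for your workers, not a documented default:

```python
import math

def choose_num_shards(total_output_bytes, target_shard_bytes=256 * 1024 * 1024):
    """Heuristic shard count: aim for roughly fixed-size shards.

    `target_shard_bytes` (~256 MB here) is an assumed target, not a
    Beam default; shrink it if workers still run out of memory.
    Always returns at least 1 shard.
    """
    return max(1, math.ceil(total_output_bytes / target_shard_bytes))
```

For example, ~10 GiB of expected output with a 256 MiB target yields 40 shards, which you could then pass as `beam.io.WriteToText(path, num_shards=40)`.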
I expect a better sharding approach could make the pipeline less efficient but keep it from crashing. In general, I am not sure how to make Dataflow pipelines more robust against failures caused by large outliers.
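Not from the original post, but one standard way to harden a GroupBy against large outlier keys is to pre-aggregate under a salted key, so no single group has to fit all of a hot key's values in one worker's memory (the Beam Python SDK exposes a built-in version of this via `beam.CombinePerKey(...).with_hot_key_fanout(n)`). A plain-Python sketch of the salting idea, with hypothetical helper names:

```python
import random
from collections import defaultdict

def salted_sums(pairs, fanout=10):
    """Two-stage per-key sum with key salting (illustrative names).

    Stage 1 groups by (key, salt), so each partial group holds only
    ~1/fanout of a hot key's values; stage 2 merges at most `fanout`
    small partial results per key instead of every raw value.
    """
    # Stage 1: partial sums keyed by (key, random salt).
    partial = defaultdict(int)
    for k, v in pairs:
        partial[(k, random.randrange(fanout))] += v
    # Stage 2: merge the partial sums back to the original key.
    totals = defaultdict(int)
    for (k, _salt), s in partial.items():
        totals[k] += s
    return dict(totals)
```

This only helps when the per-group work is combinable (sums, counts, top-N, and so on); if you truly need every raw value of a hot key in one place, salting merely delays the blow-up to the merge stage.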
For a bit more context, my error message on Dataflow looked like this: