Posted on 16 Aug 2022, this text provides information on Syllabus Queries related to Course Queries. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.
I'm just looking for advice on how I can get my code to run faster. It's pretty quick right now when searching through 30 three-page PDFs, but I imagine that once there are thousands of files to search it will take longer than I'd like. I can change SearchOption.AllDirectories to TopDirectoryOnly. I've done some testing, though, and it seems that what takes the longest is searching inside the files, not enumerating the directory.
    public string ReadPdfFile(string fileName, String searchText)
    {
        List<int> pages = new List<int>();
        if (File.Exists(fileName))
        {
            PdfReader pdfReader = new PdfReader(fileName);
            for (int page = 1; page <= pdfReader.NumberOfPages; page++)
            {
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
                if (currentPageText.Contains(searchText))
                {
                    pages.Add(page);
                }
            }
            pdfReader.Close();
        }
        if (pages.Count == 0)
            return null;
        else
            return fileName;
    }

    protected void txtBoxSearchPDF_Click(object sender, EventArgs e)
    {
        if (txtBoxSearchString.Text == "")
        {
            lblNoSearchString.Visible = true;
        }
        else
        {
            lblNoSearchString.Visible = false;
            var files = from file in Directory.EnumerateFiles(@"C:\schools\syllabus", "*.pdf", SearchOption.AllDirectories)
                        select new { File = file, };
            StringBuilder sb = new StringBuilder();
            foreach (var f in files)
            {
                string fileNameOnly = string.Empty;
                string pdfSearchMatch = ReadPdfFile(f.File, txtBoxSearchString.Text);
                if (pdfSearchMatch != null)
                {
                    string domainURL = Regex.Replace(pdfSearchMatch, @"C:\\schools\\syllabus", @"https://mywebsite.com/search/syllabus/");
                    string finalSyllabusURL
The major bottleneck is most likely the ReadPdfFile method, since extracting the text of every page of a PDF is expensive.
In your ReadPdfFile method, a PdfReader is created to read through every page of the document looking for searchText, and the page numbers on which searchText is found are stored in a List named pages. Once the reader has run through every page, the method returns null or the file name, depending on whether the number of matching pages is 0.
What you could do is to return as soon as you have found the text, so that you don't have to look through the entire document for nothing.
The method has been renamed to better reflect what it actually does, and the return type has been changed to bool, since we only need to know whether the file contains the search text.
    // Assumes iTextSharp 5.x, with:
    //   using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    public bool SearchPdfFile(string fileName, string searchText)
    {
        // Technically this should not happen, since "you" are the caller, so a
        // missing file is treated as an error instead of returning false as
        // the original workflow did.
        if (!File.Exists(fileName))
            throw new FileNotFoundException("File not found", fileName);

        using (PdfReader pdfReader = new PdfReader(fileName))
        {
            for (int page = 1; page <= pdfReader.NumberOfPages; page++)
            {
                // A fresh strategy per page, as in the original ReadPdfFile, so
                // text from earlier pages is not carried over into this one.
                var strategy = new SimpleTextExtractionStrategy();
                var currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);

                // Stop at the first page that contains the search text.
                if (currentPageText.Contains(searchText))
                    return true;
            }
        }

        return false;
    }
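To see how much work the early return can save, here is a minimal, library-free sketch. It does not use the PDF library at all: `ExtractPages` is a hypothetical stand-in for per-page text extraction, and `PagesExtracted` is a counter added purely for illustration. The page texts are made up.

```csharp
using System;
using System.Collections.Generic;

public static class EarlyExitSketch
{
    // Illustration only: counts how many pages were "extracted".
    public static int PagesExtracted;

    // Hypothetical stand-in for per-page text extraction. Pages are yielded
    // lazily, so a page is only "extracted" when the caller asks for it.
    public static IEnumerable<string> ExtractPages(string[] pages)
    {
        foreach (var pageText in pages)
        {
            PagesExtracted++;
            yield return pageText;
        }
    }

    // Same shape as the search loop above: return on the first matching page.
    public static bool ContainsText(IEnumerable<string> pageTexts, string searchText)
    {
        foreach (var text in pageTexts)
        {
            if (text.Contains(searchText))
                return true; // remaining pages are never extracted
        }
        return false;
    }
}
```

For example, with the pages `{ "intro", "grading policy", "schedule" }` and the search text `"grading"`, `ContainsText(ExtractPages(pages), "grading")` returns true after extracting two of the three pages. The saving only applies to files that contain the text; a file without a match still has every page read.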