How to ingest large amount of logs in ASP.Net Web API

Answers (3)

manpreet · Best Answer · 2 years ago

 

I am new to API development and I want to create a Web API endpoint that will receive a large amount of log data, which I then want to send to an Amazon S3 bucket via an Amazon Kinesis delivery stream. Below is a sample application that works fine, but I have no clue how to ingest a large volume of inbound data, what format my API should receive it in, or what my API endpoint should look like.

 
    // using Amazon.KinesisFirehose;
    // using Amazon.KinesisFirehose.Model;

    [HttpPost]
    public async Task Post() // HOW to allow it to receive a large chunk of data?
    {
        await WriteToStream();
    }

    private async Task WriteToStream()
    {
        const string myStreamName = "test";
        Console.Error.WriteLine("Putting records in stream : " + myStreamName);

        // Write 10,000 UTF-8 encoded records to the delivery stream.
        for (int j = 0; j < 10000; ++j)
        {
            // I AM HARDCODING DATA HERE FROM THE LOOP COUNTER!!!
            byte[] dataAsBytes = Encoding.UTF8.GetBytes("testdata-" + j);
            using (MemoryStream memoryStream = new MemoryStream(dataAsBytes))
            {
                PutRecordRequest putRecord = new PutRecordRequest();
                putRecord.DeliveryStreamName = myStreamName;
                Record record = new Record();
                record.Data = memoryStream;
                putRecord.Record = record;
                await kinesisClient.PutRecordAsync(putRecord);
            }
        }
    }

P.S.: In a real-world app I will not have that for loop. I want my API to ingest large amounts of data; what should the definition of my API be? Do I need to use something like multipart/form-data? Please guide me.

manpreet 2 years ago

Here is my thought process. As you are exposing an API for logging, your input should contain the attributes below (a minimal model sketch follows the list):

  • Log level (info, debug, warn, fatal)
  • Log message (string)
  • Application ID
  • Application instance ID
  • Application IP
  • Host (machine on which the error was logged)
  • User ID (for whom the error occurred)
  • Timestamp in UTC (time at which the error occurred)
  • Additional data (customisable as XML/JSON)
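
A minimal sketch of what such a log entry model could look like in C# (the class and property names here are illustrative assumptions, not part of the original answer):

    public class LogEntry
    {
        public string Level { get; set; }               // info, debug, warn, fatal
        public string Message { get; set; }
        public string ApplicationId { get; set; }
        public string ApplicationInstanceId { get; set; }
        public string ApplicationIp { get; set; }
        public string Host { get; set; }                // machine on which the error was logged
        public string UserId { get; set; }              // for whom the error occurred
        public DateTime TimestampUtc { get; set; }
        public string AdditionalData { get; set; }      // XML or JSON payload
    }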

I would suggest exposing the API as an AWS Lambda behind API Gateway, as that will help with scaling out as the load increases.

For a sample of how to build the API and use model binding, you may refer to https://docs.microsoft.com/en-us/aspnet/web-api/overview/formats-and-model-binding/model-validation-in-aspnet-web-api
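
A rough sketch of how the endpoint could bind a batch of such entries and forward them to a Kinesis Firehose delivery stream; the LogEntry type above, Newtonsoft.Json serialization, and the assumption of at most 500 entries per request are illustrative, not from the original answer:

    public class LogsController : ApiController
    {
        private readonly IAmazonKinesisFirehose kinesisClient = new AmazonKinesisFirehoseClient();

        [HttpPost]
        public async Task<IHttpActionResult> Post([FromBody] List<LogEntry> entries)
        {
            if (entries == null || entries.Count == 0)
                return BadRequest("No log entries supplied.");

            // Firehose accepts at most 500 records per PutRecordBatch call;
            // larger batches would need to be split into chunks.
            var records = entries.Select(e => new Record
            {
                Data = new MemoryStream(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(e)))
            }).ToList();

            await kinesisClient.PutRecordBatchAsync(new PutRecordBatchRequest
            {
                DeliveryStreamName = "test",   // delivery stream name from the question
                Records = records
            });

            return Ok();
        }
    }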


manpreet 2 years ago

I don't have much context, so I will try to answer based on how I see it.

First, instead of sending the data to the Web API, I would send it directly to S3. In Azure there is a Shared Access Signature: you send a request to your API, and it gives you a URL to upload the file to (there are many options; you can limit by time, or by which IPs can upload). So to upload a file: 1. call the API to get an upload URL, 2. PUT the file to that URL. In Amazon this looks to be called a signed policy (a pre-signed URL).
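
A rough sketch of step 1 with the AWS SDK for .NET; the bucket name, key scheme, route, and expiry here are illustrative assumptions:

    [HttpPost]
    [Route("api/logs/upload-url")]
    public IHttpActionResult GetUploadUrl()
    {
        var s3Client = new AmazonS3Client();

        // Pre-signed PUT URL: the client uploads the log file straight to S3,
        // so the file never passes through (or sits in the memory of) the Web API.
        string url = s3Client.GetPreSignedURL(new GetPreSignedUrlRequest
        {
            BucketName = "my-log-bucket",                // assumed bucket name
            Key = "incoming/" + Guid.NewGuid() + ".log", // assumed key scheme
            Verb = HttpVerb.PUT,
            Expires = DateTime.UtcNow.AddMinutes(15)     // limit by time
        });

        return Ok(new { uploadUrl = url });
    }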

After that, write a Lambda function that is triggered by the S3 upload. This function will send an event (again, I don't know exactly how it works in AWS, but in Azure I would send a Blob queue message); the event will contain the URL of the file and a start position.
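
In AWS terms the trigger could look roughly like the sketch below; the SQS hand-off and the queue URL are assumptions for illustration, since the answer only says to send an event:

    public class IngestTrigger
    {
        // Requires the Amazon.Lambda.Core and Amazon.Lambda.S3Events packages.
        public async Task FunctionHandler(S3Event s3Event, ILambdaContext context)
        {
            var sqs = new AmazonSQSClient();

            foreach (var record in s3Event.Records)
            {
                var message = new
                {
                    Bucket = record.S3.Bucket.Name,
                    Key = record.S3.Object.Key,
                    StartPosition = 0L   // processing starts at the beginning of the file
                };

                // Hand the chunk description to the processing Lambda via a queue
                // (the queue URL is a placeholder, not a real endpoint).
                await sqs.SendMessageAsync(
                    "https://sqs.us-east-1.amazonaws.com/123456789012/log-chunks",
                    JsonConvert.SerializeObject(message));
            }
        }
    }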

Write a second Lambda which listens to those events and does the actual processing. In my apps I sometimes know that processing N items takes about 10 seconds, so I usually choose N so that one run takes no longer than 10-20 seconds, due to the nature of deployments. After you have processed N rows and are not yet finished, send the same event again, but now with start position = the previous start position + N (reading a byte range is sketched below).
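
A minimal sketch of reading only a byte range of the uploaded object from S3, so each invocation handles a slice rather than the whole file; the bucket, key, and the startPosition/chunkSize variables are assumptions:

    var s3 = new AmazonS3Client();

    var response = await s3.GetObjectAsync(new GetObjectRequest
    {
        BucketName = "my-log-bucket",       // assumed bucket name
        Key = "incoming/some-file.log",     // assumed key
        ByteRange = new ByteRange(startPosition, startPosition + chunkSize - 1)
    });

    using (var reader = new StreamReader(response.ResponseStream))
    {
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            // Process one log line at a time; the last (possibly partial) line
            // can be carried over into the next chunk's start position.
        }
    }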

Designing it this way you can process large files; better still, you can be smarter about it and send multiple events that specify a start line and an end line, so a single file can be processed by multiple instances in parallel.

PS: The reason I would not recommend uploading the files to the Web API is that those files will sit in memory. Say you have 1 GB files being sent from multiple sources: in that case you will kill your servers in minutes.

PS2: The format of the file depends. It could be JSON, since that is the easiest to read, but keep in mind that with large files it is expensive to read the whole file into memory, so read it as a stream instead (see the sketch below). The other option is a plain flat file, which is easy to read, since you can then read a range and process it.
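
A minimal sketch of streaming a large JSON file with Newtonsoft.Json so that only one entry is in memory at a time; the fileStream variable and the LogEntry type carried over from above are assumptions:

    using (var reader = new StreamReader(fileStream))
    using (var json = new JsonTextReader(reader))
    {
        var serializer = new JsonSerializer();

        // Walk the JSON token by token; deserialize each object in the array
        // individually instead of loading the whole document.
        while (json.Read())
        {
            if (json.TokenType == JsonToken.StartObject)
            {
                LogEntry entry = serializer.Deserialize<LogEntry>(json);
                // process entry ...
            }
        }
    }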

PS3: In Azure I would use Azure Batch jobs.


