
Data Programming and Java Programming Assignment

Objective:

Your task is to create a Java program that scrapes data from the website for both TV shows and movies. The program should fetch data in parallel to improve efficiency. Once the data is fetched, save single-value fields to separate RDS databases for TV shows and movies, and store all other data in DynamoDB. As a fallback, if saving to a database fails, the program should write the output to a CSV file instead. Add appropriate logging so that the necessary information is printed during execution.

Requirements:

1. Data Fetching:
● Extract relevant information such as title, description, release date, genre, duration, cast, director, etc.
2. Data Storage:
● Save single-value fields (e.g., title, release date) for TV shows to one RDS database and for movies to another RDS database. You will need to create the tables in the shared database.
● Store all other data (e.g., description, genre, cast) for both TV shows and movies in DynamoDB. The DynamoDB table will be created with a primary key.
● Implement batch processing for saving data to the databases to improve efficiency.
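
The batching requirement above can be sketched without committing to a particular database client. The following is a minimal illustration, not a prescribed design: `BatchWriter`, `partition`, and `writeInBatches` are hypothetical names, and the sink is a placeholder for whatever actually persists a batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: splits records into fixed-size batches and hands each batch to a sink. */
final class BatchWriter {

    private BatchWriter() { }

    /** Splits items into consecutive sublists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Sends every batch to the sink (the assumed database call) and returns the batch count. */
    static <T> int writeInBatches(List<T> items, int batchSize, Consumer<List<T>> sink) {
        List<List<T>> batches = partition(items, batchSize);
        for (List<T> batch : batches) {
            sink.accept(batch);
        }
        return batches.size();
    }
}
```

For the RDS side, the sink could wrap JDBC's `PreparedStatement.addBatch()` and `executeBatch()`. For DynamoDB, a single `BatchWriteItem` request accepts at most 25 put or delete requests, so 25 is a natural upper bound for the batch size there.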
3. Fallback Mechanism:
● If saving to the database fails for any reason, write the output data to a CSV file and share that file.
● The CSV file should include all the fetched data fields for each TV show and movie.
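
One possible shape for the fallback is sketched below. It assumes the database save is passed in as a callback that throws an unchecked exception on failure; `CsvFallback` and `saveOrFallback` are hypothetical names, and the quoting follows the usual RFC 4180 convention rather than anything the brief specifies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;

/** Hypothetical fallback: try the database first, write CSV if that fails. */
final class CsvFallback {
    private static final Logger LOG = Logger.getLogger(CsvFallback.class.getName());

    private CsvFallback() { }

    /** Quotes a field when it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    static String toCsvLine(List<String> fields) {
        return fields.stream().map(CsvFallback::escape).collect(Collectors.joining(","));
    }

    /** Returns true if dbSaver succeeded; otherwise writes header and rows to csvOut and returns false. */
    static boolean saveOrFallback(List<List<String>> rows, List<String> header,
                                  Consumer<List<List<String>>> dbSaver, Writer csvOut) {
        try {
            dbSaver.accept(rows);
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.WARNING, "Database save failed, falling back to CSV", e);
            try {
                csvOut.write(toCsvLine(header));
                csvOut.write("\n");
                for (List<String> row : rows) {
                    csvOut.write(toCsvLine(row));
                    csvOut.write("\n");
                }
                csvOut.flush();
            } catch (IOException io) {
                throw new UncheckedIOException("CSV fallback failed", io);
            }
            return false;
        }
    }
}
```

In a real run, `csvOut` would typically come from `Files.newBufferedWriter(...)`, and multi-value fields such as cast would need to be flattened into one cell (for example, joined with a separator) before reaching this method.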
4. Database Schema:
● Design appropriate database schemas for RDS tables to store the single-value fields for TV shows and movies separately. Document your schema designs.
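
As a rough illustration of how one scraped record might be divided between the two stores, the sketch below maps the example fields from the brief onto an RDS row and a DynamoDB item. The class name `TitleRecord`, the column names, and the shared `id` key are all assumptions; the brief does not say where fields such as duration or director belong, so they are left out.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical scraped record; field names follow the examples given in the brief. */
final class TitleRecord {
    private final String id;            // assumed key shared by the RDS row and the DynamoDB item
    private final String title;
    private final String releaseDate;   // kept as ISO-8601 text here for simplicity
    private final String description;
    private final List<String> genres;
    private final List<String> cast;

    TitleRecord(String id, String title, String releaseDate,
                String description, List<String> genres, List<String> cast) {
        this.id = id;
        this.title = title;
        this.releaseDate = releaseDate;
        this.description = description;
        this.genres = genres;
        this.cast = cast;
    }

    /** Single-value fields destined for the RDS table, in column order. */
    Map<String, Object> rdsColumns() {
        Map<String, Object> columns = new LinkedHashMap<>();
        columns.put("id", id);
        columns.put("title", title);
        columns.put("release_date", releaseDate);
        return Collections.unmodifiableMap(columns);
    }

    /** Remaining fields destined for the DynamoDB item, keyed by the same id. */
    Map<String, Object> dynamoAttributes() {
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("id", id);
        attributes.put("description", description);
        attributes.put("genre", genres);
        attributes.put("cast", cast);
        return Collections.unmodifiableMap(attributes);
    }
}
```

Keeping the same `id` in both places is one way to join an RDS row back to its DynamoDB item later; the documented schema should state whichever key is actually chosen.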

5. Implementation:
● Write clean and well-documented Java code using the latest technology.
● Utilize parallelism (e.g., Java threads, ExecutorService) for data fetching to improve performance.
● Utilize libraries/frameworks such as Spring Boot for web scraping, database interaction, and HTTP requests.
● Implement batch processing for database operations using Spring Batch or similar technologies.
● Implement logging statements to print necessary information during execution.
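
The parallel fetching and logging points above might look roughly like the following sketch. It uses only the JDK (`ExecutorService` and `java.util.logging`); `ParallelFetcher` and `fetchAll` are hypothetical names, and `fetchOne` stands in for whatever performs the HTTP request and parsing for a single page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper: runs one fetch per URL on a fixed-size thread pool. */
final class ParallelFetcher {
    private static final Logger LOG = Logger.getLogger(ParallelFetcher.class.getName());

    private ParallelFetcher() { }

    /**
     * Applies fetchOne to every URL in parallel.
     * Results keep the input order; a failed fetch is logged and skipped.
     */
    static <R> List<R> fetchAll(List<String> urls, Function<String, R> fetchOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fetchOne.apply(url));
            }
            // invokeAll blocks until every task has completed or failed.
            List<Future<R>> futures = pool.invokeAll(tasks);
            List<R> results = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    results.add(futures.get(i).get());
                } catch (ExecutionException e) {
                    LOG.log(Level.WARNING, "Fetch failed for " + urls.get(i), e.getCause());
                }
            }
            LOG.info("Fetched " + results.size() + " of " + urls.size() + " pages");
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            throw new IllegalStateException("Interrupted while fetching", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Here `fetchOne` could wrap `java.net.http.HttpClient` (available since Java 11) or a scraping library. The pool size is a tuning choice; a small fixed pool also limits how hard the target site is hit at once.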
6. Testing (Optional):
● Implement unit tests to ensure the correctness of your code.
● Test your program with various scenarios to handle edge cases gracefully.
7. Documentation:
● Write clear documentation explaining how your program works, including any assumptions made and potential limitations.
● Include setup instructions, usage examples, and any additional information that may be helpful for someone reviewing your code.

Additional Notes:

● You are encouraged to use the latest Java technologies and libraries to demonstrate your knowledge and skills.
● You are required to upload all the source code, with its dependencies, to your Git repository and share the repository with us.
● When extracted, the code needs to be runnable on a new machine with Java 11+.
● Pay attention to efficient batch processing implementation for database operations to optimize performance.
● Ensure that the CSV output is well-formatted and contains all necessary data fields.
● Utilize parallelism effectively to improve the efficiency of data fetching.
● Implement logging statements strategically to provide insights into the program's execution flow.
● Deployment using Docker (optional).

© 2021 www.rj363.com