SOEN 363: Data Systems for Software Engineers
Assignment 3
Overview

In this assignment, you create a NoSQL database of movies and their information. The movie data are extracted directly from assignment 2 and transferred into the NoSQL database.

Implementation Platform

We use Neo4j [1] in this assignment. While you may find many tutorials online, attending the tutorial sessions is strongly recommended. For help with programming or questions about the platform, please see the PODs.

1 Data Transfer

The data transfer is done by converting the data of each relation in the RDBMS into CSV (or TSV) or JSON data. You will then import that data directly into Neo4j. See:

https://neo4j.com/developer/guide-import-csv/
https://neo4j.com/labs/apoc/4.1/import/load-json/

Entities / Nodes

In this assignment you create the following entities (nodes) and populate them with data:

- Movies (attributes: title, description / plot (full text), rating, release year, runtime, genres, and languages)
- Actors (first name, last name)
- Countries
- Keywords

Data Files and Scripts

- [15 pts] Extract the data from your database into data files (CSV, TSV, or JSON; see the footnote below).
- [40 pts] Write scripts to create the database in Neo4j. Note that Neo4j supports array attributes, which are normally represented using weak entities in the relational model: https://neo4j.com/docs/cypher-manual/current/functions/list/. To populate such data (i.e. genres, languages), you may use a separate CSV file.

Footnote: Using JSON is not recommended, but is permitted. Relational data can naturally be represented directly in a tabular data format such as CSV or TSV.

2 Queries

Provide the answers to the following:

A) [5 pts] Find all movies that feature a sample actor.
B) [5 pts] Find all movies that were released after the year 2000 and have a rating of at least 5.
C) [5 pts] Find all movies that share two keywords of your choice. Make sure your query returns more than one movie.
D) [10 pts] Find the top 2 movies with the largest number of keywords.
E) [10 pts] Find the top 10 movies (ordered by rating) in a language of your choice.
F) [5 pts] Build a full-text search index to query movie plots.
G) [5 pts] Write a full-text search query and search for some sample text of your choice.

Make sure all of the above queries return data. Modify the data in your database, if necessary.

Submit your assignment electronically on Moodle: https://moodle.concordia.ca

Include your name and student ID in the submission. Make sure that you upload the assignment to the correct assignment box on Moodle. No email submissions are accepted. Assignments uploaded to the wrong system or the wrong folder, or submitted via email, will be discarded and no resubmission will be allowed. Make sure you can access Moodle prior to the submission deadline. The deadline will not be extended.

References

1. https://neo4j.com/
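
Example (data transfer sketch)

The data transfer step in Section 1 converts each relation into a CSV file. The following is a minimal sketch of that step, not a required or official solution. It assumes an in-memory SQLite database as a stand-in for the assignment 2 RDBMS; the table name movies, its columns, the sample rows, and the helper export_relation are all hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3


def export_relation(conn, table, out, delimiter=","):
    """Write every row of `table` to `out` as delimited text with a header row.

    Hypothetical helper: returns the number of data rows written.
    """
    # Table names cannot be bound as SQL parameters, so accept plain identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    # The header row is taken from the column names reported by the cursor.
    writer.writerow([column[0] for column in cursor.description])
    row_count = 0
    for row in cursor:
        writer.writerow(row)  # the csv module quotes fields containing the delimiter
        row_count += 1
    return row_count


# Assumption: a tiny in-memory table stands in for the real assignment 2 database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies ("
    "id INTEGER PRIMARY KEY, title TEXT, release_year INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?, ?)",
    [(1, "Sample Movie A", 2001, 7.5), (2, "Sample Movie, B", 1999, 4.0)],
)

# Write to an in-memory buffer here; a real export would open a file instead.
buffer = io.StringIO()
exported = export_relation(conn, "movies", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With a different RDBMS, the connection would come from that system's own Python driver, and passing delimiter="\t" would produce TSV instead. A file with a header row like this one is the kind of input that Cypher's LOAD CSV WITH HEADERS clause, covered in the import guide linked in Section 1, is meant to read.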