Academic Handbook Course Descriptors and Programme Specifications

LDSCI6209 Large-Scale Information Storage and Retrieval Course Descriptor

Course code: LDSCI6209
Discipline: Computer & Data Science
UK credit: 15
US credit: 4
FHEQ level: 6
Date approved: November 2022
Core attributes: Analysing and Using Data (AD)
Pre-requisites: (LCSCI5208 Database Design or LDSCI5247 Foundations of Data Science) and (LCSCI5205 Object-Oriented Design or LDSCI5206 Advanced Programming with Data)
Co-requisites: None

Course Overview

Scalability is an essential quality of modern data storage and retrieval systems; specialised skills and knowledge are required to build systems that scale efficiently. This course introduces key principles and methods of data storage and retrieval for both structured and unstructured data in a distributed environment (e.g., distributed databases, key-value stores, or graph databases). It explores issues of data quality assurance, storage reliability, and scalability when working with large volumes of data. Students gain practical experience with modern frameworks, methods and techniques for large-scale information storage and retrieval.

Learning Outcomes

On successful completion of the course, students will be able to:

Knowledge and Understanding

K1c Systematically understand fundamental aspects of the theory and practice of building scalable information storage and retrieval systems.
K2c Accurately identify scalability requirements and fundamental engineering principles, methods and tools to meet them when designing a large-scale information storage and retrieval system.
K3c Demonstrate detailed knowledge and systematic understanding of the capabilities and limitations of fundamental methods and techniques for building scalable information storage and retrieval systems.

Subject Specific Skills

S1c Critically evaluate the technical, management and social dimensions that underpin the development of scalable information storage and retrieval systems (for availability, security, privacy and so on).
S2c Develop high availability, high performance and secure information storage and retrieval systems based on best practices and industry standards.

Transferable and Employability Skills

T2c Review key developments in information storage and retrieval systems and analyse the effectiveness of changes made to a computer system to scale.
T3c Display an advanced level of technical proficiency in written English and competence in applying scholarly terminology, so as to be able to apply skills in critical evaluation, analysis and judgement effectively in a diverse range of contexts.
T4c Work in a proactive and effective manner as part of a team, exercising initiative and responsibility in managing, planning, and developing a large-scale information storage and retrieval system.

Teaching and Learning

This course has a dedicated Virtual Learning Environment (VLE) page with a syllabus and range of additional resources (e.g. readings, question prompts, tasks, assignment briefs, discussion boards) to orientate and engage students in their studies.

The scheduled teaching and learning activities for this course are:

Lectures/labs. 40 scheduled hours – typically including induction, consolidation or revision, and assessment activity hours:

  • Version 1: all sessions in the same-sized group, or
  • Version 2: most of the sessions in larger groups; some of the sessions in smaller groups

Faculty hold regular ‘office hours’, which are opportunities for students to drop in or sign up to explore ideas, raise questions, or seek targeted guidance or feedback, individually or in small groups.

Students are to attend and participate in all the scheduled teaching and learning activities for this course and to manage their directed learning and independent study.

Indicative total learning hours for this course: 150

Assessment

Both formative and summative assessment are used as part of this course, with purely formative opportunities typically embedded within interactive teaching sessions, office hours, and/or the VLE.

Summative Assessments

AE  Assessment Activity          Weighting (%)  Duration     Length (words)
1   Set Exercises                40             24-32 hours
2   Written Assignment           30                          2,500
3   Written Assignment – Group   30                          2,500

Further information about the assessments can be found in the Course Syllabus.

Feedback

Students will receive formative and summative feedback in a variety of ways, written (e.g. marked up on assignments, through email or the VLE) or oral (e.g. as part of interactive teaching sessions or in office hours).

Indicative Reading

Note: Comprehensive and current reading lists are produced annually in the Course Syllabus or other documentation provided to students; the indicative reading list provided below is used as a general guide and part of the approval/modification process only.

  • Tim Peierls, Brian Goetz, Joshua Bloch, Joseph Bowbeer, Doug Lea, and David Holmes. 2005. Java Concurrency in Practice. Addison-Wesley.
  • George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair. 2011. Distributed Systems: Concepts and Design (5th. ed.). Addison-Wesley.
  • Martin Kleppmann. 2017. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly Media.

Indicative Topics

Note: Comprehensive and current topics for courses are produced annually in the Course Syllabus or other documentation provided to students; the indicative topics provided below are used as a general guide and part of the approval/modification process only.

  • Distributed data stores, including NoSQL databases, key-value stores and graph databases
  • Large-scale data processing systems
  • Complexity and hardness of large-scale data storage and retrieval algorithms
  • Serverless architectures
  • Replication, partitioning, and consistency in distributed data

Version History

Title: LDSCI6209 Large-Scale Information Storage and Retrieval

Approved by: Academic Board

Location: academic-handbook/programme-specifications-and-handbooks/undergraduate-programmes

Version number  Date approved  Date published  Owner                    Proposed next review date  Modification (as per AQF4) & category number
1.1             July 2023      September 2024  Dr Alexandros Koliousis  November 2027              Category 1: Corrections/clarifications to documents which do not change approved content or learning outcomes
1.0             November 2022  January 2023    Dr Alexandros Koliousis  November 2027