Data Science with AI
Yearlong Scope & Sequence
How do we responsibly turn social questions and lived experience into data-driven knowledge?
Rather than treating AI or data tools as shortcuts, the course emphasizes how knowledge is produced, constrained, and communicated. Students learn to think like researchers first and technicians second. The course is structured across two semesters, with the first focused on intellectual foundations and question formation, and the second focused on execution, analysis, and communication through a yearlong Data Science Final Project (DSFP).
Semester 1: Foundations — How Knowledge Is Produced
Semester Outcome: By the end of Semester 1, students complete a locked, research-ready Data Science Final Project (DSFP) Proposal. This proposal defines what they will study, why it matters, and how it can realistically be investigated with data.
Semester 1 emphasizes conceptual grounding, theoretical literacy, and disciplined question formation. Ambiguity is addressed early so that later work can proceed in a structured and procedural way.
Units 1 & 2 — Foundations of Systems and Power
(Weeks 1–12, gradually tapering)
Target Number of Weeks: 10
These units form the conceptual backbone of Semester 1, establishing the intellectual tools required to study society, institutions, and collective behavior using evidence rather than opinion. Extended time here provides conceptual “breathing room” and a shared vocabulary for later data-driven work. These foundations are revisited and applied throughout the year.
Core Themes:
- Social systems and institutions
- Power, authority, and legitimacy
- Norms, incentives, and constraints
- Group behavior and social organization
- How institutional design shapes outcomes
- What makes a question investigable in the social sciences
The emphasis is on analysis over civic participation, treating political and social systems as objects of study. The goal is college-level rigor, avoiding advocacy or moral instruction.
Unit 3 — Tech Stack & Applied AI (Foundational Literacy)
(Introduced Week 5, expands gradually)
Target Number of Weeks: 3
Introduced alongside foundational instruction around Week 5, this unit combines technical infrastructure and applied AI literacy into a single conceptual framework. It orients students to the tools of social data science and explains why those tools exist. The goal is to reduce cognitive load later in the course, not to train for mastery at this stage. The tone is demystifying and analytical, not promotional or futuristic.
Key Topics:
- Hardware concepts (CPU, GPU, memory, storage) and local vs. cloud computing
- Categories of analytical software and where AI fits in data workflows
- Constraints, tradeoffs, and limitations of AI
- Appropriate vs. inappropriate uses of AI in academic inquiry, emphasizing limits, bias, and human responsibility
This is not a programming unit, and technical proficiency is not the expectation. AI is presented as a tool within a larger workflow, not an authority or knowledge source.
Unit 4 — Phenomenology
(Introduced after Foundations conclude)
Target Number of Weeks: 2
Marking a conceptual pivot, this unit serves as a bridge between systems knowledge and research design. It is framed as an examination of lived experience versus recorded data—a study of what data captures, distorts, or omits. This prepares students to think critically about data before collecting it, grounding later ethical and methodological decisions in the DSFP.
Core Ideas:
- Subjective experience vs. quantitative representation
- Bias, perspective, and positionality
- The limits of measurement
- Why interpretation matters in social research
Unit 5 — The Social Data Science Process (Beginning with Stage 1: Ask Questions)
Target Number of Weeks: 3
This unit launches the Data Science Final Project (DSFP) framework. The focus is on Stage 1, Asking Good Investigative Questions, and on distinguishing simple curiosity from researchable questions. Semester 1 ends with question formation, not analysis, positioning Semester 2 as the space where tools and methods are applied deeply.
Key Components:
- Dear Data / Postcard Project: A hands-on activity framed as an application of phenomenology, with time for personal data collection and reflection.
- Introduction to the UN Sustainable Development Goals as framing lenses.
- Development of an initial investigative question for the DSFP.
- Early thesis or proposal formation.
Semester 2: Applications — Executing the Research
Semester Outcome: Students complete and communicate a full Data Science Final Project (DSFP), moving from data collection through analysis and public-facing communication.
Because questions are locked in advance, Semester 2 emphasizes procedural discipline, specification, and iterative reasoning.
Unit 6: The Data Science Process — Gather, Clean, Explore
(Weeks 19–27)
Target Number of Weeks: 9
With approved DSFP proposals in place, students shift from conceptual design to structured data work. This unit focuses on translating research questions into usable datasets through primary and secondary sources.
Students learn that data work is rarely glamorous but is foundational to credible analysis.
Core Focus Areas:
- Survey construction and question logic
- Primary data collection (class-generated surveys)
- Secondary data sourcing (e.g., Pew Research, Census data)
- Data cleaning, validation, and documentation
- Univariate and multivariate exploratory analysis
- Identifying missing data, bias, and quality issues
This unit is intentionally spec-driven and procedural, allowing for consistent pacing and clear expectations across projects.
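As one illustration of the cleaning and exploration steps listed above, the following minimal sketch counts missing answers per survey field and produces a univariate summary that skips them. The survey fields (`age`, `hours_online`), the sample rows, and the helper names `missing_counts` and `summarize` are hypothetical and invented for this example; the course may use different tools entirely.

```python
from statistics import mean, median

# Hypothetical class-generated survey responses; None marks a skipped question.
responses = [
    {"id": 1, "age": 16, "hours_online": 3.5},
    {"id": 2, "age": 17, "hours_online": None},
    {"id": 3, "age": None, "hours_online": 5.0},
    {"id": 4, "age": 16, "hours_online": 2.0},
    {"id": 5, "age": 18, "hours_online": 4.5},
]


def missing_counts(rows, fields):
    """Count how many rows have no recorded value for each field."""
    return {f: sum(1 for r in rows if r.get(f) is None) for f in fields}


def summarize(rows, field):
    """Univariate summary of one numeric field, skipping missing values."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


missing = missing_counts(responses, ["age", "hours_online"])
hours_summary = summarize(responses, "hours_online")

print(missing)
print(hours_summary)
```

Reporting `n` alongside the mean and median keeps the amount of missing data visible, which supports the documentation and quality checks this unit calls for.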
Unit 7: The Data Science Process — Analyze, Model, Synthesize
(Weeks 28–32)
Target Number of Weeks: 5
In this unit, students move beyond description into explanation. Analysis is framed as an iterative reasoning process rather than a search for a single “correct” model.
Emphasis is placed on defensible insight, transparency, and theoretical grounding.
Key Practices:
- Identifying meaningful patterns and relationships
- Selecting appropriate analytical techniques
- Iterating on analysis based on findings
- Integrating results with sociological and institutional theory
- Recognizing limitations and avoiding overclaiming
- Treating models as tools for thinking, not answers in themselves
Students learn analytical judgment, restraint, and interpretive humility.
Unit 8: The Data Science Process — Communicate Results
(Weeks 33–36)
Target Number of Weeks: 4
The final unit centers on accountability through communication. Students translate their DSFP findings into forms appropriate for real audiences, recognizing that interpretation does not end with analysis.
Communication is treated as an extension of the research process, not a final add-on.
Possible DSFP Deliverables:
- Written DSFP research report
- Data visualizations and dashboards
- Formal oral or recorded presentations
- Executive summaries for non-technical audiences
- Reflections on limitations, ethics, and bias
- Personal DSFP Website
  - Students build and launch a personal website to host their DSFP
  - Sites may use a custom domain or a custom subdomain of *AiEmpoweredU.org*
  - Websites serve as public-facing homes for research artifacts and analysis
  - All sites link back to *AiEmpoweredU.org* as part of a shared academic network
Development Method:
Student websites and final artifacts are built using Spec-Driven Development, requiring students to define purpose, structure, constraints, and success criteria before implementation.
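One way to picture the "define before implementation" requirement is as a small checklist that must be complete before any building begins. The sketch below is an assumption for illustration only: the `SiteSpec` class, its field names, and the helper functions are hypothetical, and the course may well use a written template rather than code.

```python
from dataclasses import dataclass, field


@dataclass
class SiteSpec:
    """Hypothetical container for the four parts of a student's spec."""
    purpose: str = ""
    structure: list = field(default_factory=list)
    constraints: list = field(default_factory=list)
    success_criteria: list = field(default_factory=list)


def missing_sections(spec):
    """Return the names of spec sections that are still empty."""
    sections = {
        "purpose": spec.purpose.strip(),
        "structure": spec.structure,
        "constraints": spec.constraints,
        "success_criteria": spec.success_criteria,
    }
    return [name for name, value in sections.items() if not value]


def ready_to_build(spec):
    """A spec is ready only when every section has been filled in."""
    return not missing_sections(spec)


# A draft with purpose and structure defined, but nothing else yet.
draft = SiteSpec(
    purpose="Host my DSFP report and visualizations",
    structure=["Home", "Question", "Data", "Findings", "Limitations"],
)

gaps = missing_sections(draft)
print(gaps)                   # ['constraints', 'success_criteria']
print(ready_to_build(draft))  # False
```

The point of the sketch is the ordering: the gaps are surfaced and resolved before implementation starts, not discovered afterward.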
Final Outcome:
A completed Data Science Final Project (DSFP) that demonstrates:
- A well-formed investigative question
- Responsible data collection and preparation
- Thoughtful analysis and synthesis
- Interpretation grounded in theory
- Clear, public-facing communication of results