Data Science with AI

Yearlong Scope & Sequence

How do we responsibly turn social questions and lived experience into data-driven knowledge?

Rather than treating AI or data tools as shortcuts, the course emphasizes how knowledge is produced, constrained, and communicated. Students learn to think like researchers first and technicians second. The course is structured across two semesters, with the first focused on intellectual foundations and question formation, and the second focused on execution, analysis, and communication through a yearlong Data Science Final Project (DSFP).

Semester 1: Foundations — How Knowledge Is Produced

Semester Outcome: By the end of Semester 1, students complete a locked, research-ready Data Science Final Project (DSFP) Proposal. This proposal defines what they will study, why it matters, and how it can realistically be investigated with data.

Semester 1 emphasizes conceptual grounding, theoretical literacy, and disciplined question formation. Ambiguity is addressed early so that later work can proceed in a structured and procedural way.

Unit 0–1: Computing, Artificial Intelligence, and the Course Tech Stack (Weeks 1–3)

The course begins by situating artificial intelligence within the broader field of computer science. AI is presented not as a chatbot or creative substitute for thinking, but as a computational subfield with specific capabilities, limitations, and ethical constraints.

Students develop a working understanding of how computing systems operate, how data moves through those systems, and where AI tools fit within that landscape. The course tech stack is introduced alongside explicit norms for tool usage, data storage, and academic responsibility.

Key Ideas:

  • AI as a subfield of computer science
  • Hardware, software, and data systems
  • Online vs. offline AI tools
  • Scientist AI vs. Agentic AI
  • Boundaries for appropriate AI use in academic research
  • How and where course work is created, stored, and submitted

This unit establishes the rules of engagement for the entire year before students rely on advanced tools.

Unit 2: Introductory Sociology (CLEP-Aligned) (Weeks 4–12)

This unit provides students with the conceptual vocabulary needed to think in variables, populations, institutions, and structures. Rather than surveying sociology for its own sake, the course treats sociological theory as a practical framework for explaining patterns and outcomes.

Content is aligned with CLEP Introductory Sociology expectations, with emphasis placed on transferability to data science and research design.

Core Focus Areas:

  • The sociological imagination
  • Culture, norms, and socialization
  • Institutions and social structure
  • Stratification and inequality
  • Power, systems, and social organization
  • Using theory to explain observable patterns

Throughout the unit, students are encouraged to connect sociological concepts to phenomena they observe in their own communities and in public datasets.

Unit 3: Phenomenology and Lived Experience (Weeks 13–15)

While sociology emphasizes structure and populations, this unit focuses on meaning, perception, and standpoint. Students explore phenomenology as a way of understanding how lived experience shapes interpretation, bias, and data collection.

This unit is critical for helping students recognize the limits of quantification and avoid treating data as a complete or neutral representation of reality.

Key Ideas:

  • Lived experience versus measured variables
  • Perspective, bias, and standpoint
  • What data can and cannot capture
  • The role of qualitative thinking in data science

By the end of this unit, students are better prepared to ask research questions that are both meaningful and responsible.

Unit 4: (Data Science Process Stage 1) Ask Questions (Weeks 16–18)

This unit serves as the intellectual bridge between foundations and application. Students synthesize computing knowledge, sociological theory, and phenomenological insight to design a feasible and researchable Data Science Final Project.

Rather than rushing into data collection, students slow down to clarify what they are studying, why it matters, and how it can realistically be investigated.

Dear Data Project (Unit 4 Launch):
Unit 4 begins with the Dear Data project, an accessible, human-centered introduction to the full Data Science Process. Students formulate a statistical investigative question about a pattern in their daily lives, collect and organize their own data over time, analyze it for variability and patterns, and communicate their findings through a hand-designed data visualization. Because the project is personal and low-stakes, students can focus on decision-making, ambiguity, and interpretation rather than technical tools.
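
The "analyze it for variability and patterns" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the week of phone-check counts and the summarize helper are hypothetical and are not part of the course materials. In the actual project, students could do the same reasoning by hand.

```python
from statistics import mean, median, pstdev

# Hypothetical one-week log: how many times a student checked their phone each day.
daily_counts = {
    "Mon": 42, "Tue": 38, "Wed": 51, "Thu": 40,
    "Fri": 65, "Sat": 72, "Sun": 56,
}


def summarize(counts):
    """Describe the center and spread of a small personal dataset."""
    values = list(counts.values())
    return {
        "mean": mean(values),                      # typical day
        "median": median(values),                  # middle day, less affected by extremes
        "range": max(values) - min(values),        # simplest measure of variability
        "std_dev": round(pstdev(values), 2),       # spread around the mean
        "highest_day": max(counts, key=counts.get),
    }


summary = summarize(daily_counts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The numbers alone do not explain why weekend counts run higher in this invented log. That interpretive step is where the project's emphasis on decision-making and ambiguity comes in.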

Major Components:

  • Identifying a social phenomenon of interest
  • Narrowing broad interests into researchable questions
  • Using theory as a constraint on question design
  • Conducting an introductory literature review using academic databases (EBSCO)
  • Evaluating feasibility, scope, and data limitations

Semester 1 Final Artifact: The DSFP Proposal

  • A locked research question
  • A clearly defined population of interest
  • A theoretical framework
  • Proposed data sources or collection methods
  • Known constraints and limitations

Semester 2: Applications — Executing the Research

Semester Outcome: Students complete and communicate a full Data Science Final Project (DSFP), moving from data collection through analysis and public-facing communication.

Because questions are locked in advance, Semester 2 emphasizes procedural discipline, specification, and iterative reasoning.

Unit 5: (Data Science Process Stage 2) Gather, Clean, and Explore (Weeks 19–27)

With approved research questions in place, students shift to systematic data work. This unit focuses on translating conceptual questions into structured datasets through surveys and secondary data sources.

Students learn that data work, though rarely glamorous, is essential and demands precision, documentation, and attention to quality.

Core Components:

  • Survey construction and question logic
  • Collecting and managing class-generated survey data
  • Working with secondary datasets (e.g., Pew Research)
  • Data cleaning and validation
  • Univariate and multivariate exploratory analysis

Once the question is fixed, this unit becomes intentionally procedural and spec-driven, allowing clear pacing and consistent expectations.
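
As a rough illustration of what "cleaning and validation" followed by univariate and multivariate exploration might look like, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption made for the example: the column names, the valid grade range, the sample responses, and the rule that a row with any problem is set aside rather than repaired.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw survey export. Column names and values are invented.
RAW_CSV = """respondent_id,grade,hours_online,feels_connected
1,9,3.5,Yes
2,10,,no
3,9,4,YES
4,13,2,No
5,11,abc,yes
6,12,6.5,No
"""

VALID_GRADES = {9, 10, 11, 12}   # assumed population: grades 9-12
VALID_ANSWERS = {"yes", "no"}


def clean_row(row):
    """Return (record, problems). record is None when the row fails validation."""
    problems = []

    grade = int(row["grade"]) if row["grade"].isdigit() else None
    if grade not in VALID_GRADES:
        problems.append(f"invalid grade: {row['grade']!r}")

    try:
        hours = float(row["hours_online"])
    except ValueError:
        hours = None
        problems.append(f"invalid hours_online: {row['hours_online']!r}")

    answer = row["feels_connected"].strip().lower()  # normalize Yes/YES/yes
    if answer not in VALID_ANSWERS:
        problems.append(f"invalid feels_connected: {row['feels_connected']!r}")

    if problems:
        return None, problems
    record = {
        "respondent_id": row["respondent_id"],
        "grade": grade,
        "hours_online": hours,
        "feels_connected": answer,
    }
    return record, problems


cleaned, rejected = [], []
for row in csv.DictReader(io.StringIO(RAW_CSV)):
    record, problems = clean_row(row)
    if record is None:
        rejected.append((row["respondent_id"], problems))  # documented, not silently dropped
    else:
        cleaned.append(record)

# Univariate: one variable at a time.
overall_mean_hours = mean([r["hours_online"] for r in cleaned])

# Multivariate: hours online broken out by a second variable.
grouped = defaultdict(list)
for r in cleaned:
    grouped[r["feels_connected"]].append(r["hours_online"])
hours_by_answer = {answer: mean(hours) for answer, hours in grouped.items()}

print(f"kept {len(cleaned)} rows, set aside {len(rejected)}")
for respondent_id, problems in rejected:
    print(f"  respondent {respondent_id}: {'; '.join(problems)}")
print(f"mean hours online: {overall_mean_hours:.2f}")
print(f"mean hours by feels_connected: {hours_by_answer}")
```

Keeping a list of rejected rows with reasons is one way to practice the documentation habit this unit describes. Whether to drop, repair, or follow up on a problematic response is a judgment call that should be recorded either way.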

Unit 6: (Data Science Process Stages 3 & 4) Analyze → Model → Synthesize Loop (Weeks 28–32)

In this unit, students engage in iterative reasoning with data. Rather than pursuing overly complex models, the emphasis is on generating defensible insights and integrating analysis with theory and context.

Students learn that models are tools for thinking, not answers in themselves.

Key Practices:

  • Identifying meaningful patterns and relationships
  • Iterating on analysis based on findings
  • Using models cautiously and transparently
  • Synthesizing results with theoretical frameworks
  • Avoiding overclaiming and misinterpretation

This unit reinforces judgment, interpretation, and analytical humility.
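
One way to picture "using models cautiously and transparently" is a simple straight-line fit that reports its own fit quality and sample size next to the result. The sketch below is an assumption-laden example, not the course's prescribed method: the two variables, the five data points, and the helper names fit_line and r_squared are all hypothetical.

```python
from statistics import mean

# Hypothetical paired observations from a small survey (invented values).
hours_online = [1.0, 2.0, 3.0, 4.0, 5.0]
sleep_hours = [8.5, 8.0, 7.0, 7.5, 6.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y that the fitted line accounts for."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


slope, intercept = fit_line(hours_online, sleep_hours)
r2 = r_squared(hours_online, sleep_hours, slope, intercept)

# Report the model together with its limits, in hedged language.
print(f"slope: {slope:.2f} hours of sleep per extra hour online")
print(f"R^2: {r2:.2f} (n = {len(hours_online)})")
print("Interpretation: an association in a very small sample, not evidence of cause.")
```

Printing the sample size and a plain-language caveat alongside the slope keeps the model in its role as a tool for thinking. The synthesis step then asks whether the pattern makes sense in light of the theoretical framework.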

Unit 7: (Data Science Process Stage 5) Communicate Results (Weeks 33–36)

The final unit focuses on accountability through communication. Students translate their DSFP findings into forms appropriate for real audiences, recognizing that how results are presented shapes how they are understood.

Communication is treated as an extension of the research process, not a final add-on.

Possible DSFP Deliverables:

  • Written DSFP report
  • Data visualizations and dashboards
  • Formal presentations
  • Executive summaries
  • Reflections on limitations, bias, and ethics
  • Personal DSFP website
    • Students build and launch a personal website to host their DSFP
    • Sites may use a custom domain or a custom subdomain of AiEmpoweredU.org
    • Websites serve as public-facing homes for research artifacts, analysis, and interpretation
    • All sites link back to AiEmpoweredU.org as part of a shared academic network

Development Method:
Student websites are built using the Spec-Driven Development method. Students define purpose, structure, content requirements, and constraints before implementation, mirroring professional data and software development workflows.
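
To make the idea of defining requirements before implementation concrete, a spec can be written down as data and a draft site checked against it. The following Python sketch is purely illustrative: the SiteSpec fields, the page names, and the check_site helper are hypothetical and do not describe any official tooling for the course. The only detail taken from this document is that sites link back to AiEmpoweredU.org.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SiteSpec:
    """A hypothetical written-down spec for a DSFP website."""
    purpose: str
    required_pages: list[str]
    required_link: str          # every site should link back to the shared network
    max_words_per_page: int     # an example constraint chosen for illustration


def check_site(spec: SiteSpec, pages: dict[str, str]) -> list[str]:
    """Compare a draft site (page name -> page text) against the spec."""
    issues = []
    for name in spec.required_pages:
        if name not in pages:
            issues.append(f"missing page: {name}")
    for name, text in pages.items():
        if len(text.split()) > spec.max_words_per_page:
            issues.append(f"page too long: {name}")
    if not any(spec.required_link in text for text in pages.values()):
        issues.append(f"no link to {spec.required_link}")
    return issues


spec = SiteSpec(
    purpose="Public-facing home for one student's DSFP",
    required_pages=["research-question", "methods", "findings", "limitations"],
    required_link="AiEmpoweredU.org",
    max_words_per_page=800,
)

draft_pages = {
    "research-question": "How does commute time relate to club participation?",
    "methods": "Class survey plus a secondary dataset.",
    "findings": "Draft charts. Part of the AiEmpoweredU.org network.",
}

issues = check_site(spec, draft_pages)
print(issues or "draft satisfies the spec")
```

Because the spec exists before the site does, the check gives a clear, shared definition of "done" instead of relying on impressions of the finished pages.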

Final Outcome: A completed Data Science Final Project (DSFP) that demonstrates:

  • A well-formed research question
  • Responsible data collection and preparation
  • Thoughtful analysis and synthesis
  • Interpretation grounded in theory
  • Clear, public-facing communication of results