Big Data Security & Privacy — Exam Study Reference

00 Exam Focus (First 50 Marks)
01 Security Fundamentals (CIA Triad)
02 Attacks: Passive, Active & Malware
03 Cryptographic Tools
04 Authentication & Access Control
05 Internet Security Protocols
06 NIST Big Data Framework (SP 1500-4r2)
07 Big Data & the V-Characteristics
08 Cloud Computing & Security
09 Cloud Security Threats & Countermeasures
10 IoT: Architecture & Components
11 IoT Security
12 Exam-Matched Practice

🎯 Exam Focus — First 50 Marks

BASED ON PROFESSOR SPOILER — APRIL 6, 2026

Exam weighting that matters here:
Q1 = 30 marks → 15 MCQs from Lecture 01: Review of Concepts / security fundamentals.
Q2 = 20 marks → essay question from Cloud Security + IoT Security.
Q3 = 50 marks is seminar-based and intentionally ignored in this sheet revision.

Highest Priority

Study Sections 1-5 first. That is the strongest match for the 15 MCQs on fundamentals: CIA, vulnerabilities, attacks, malware, crypto, authentication, access control, TLS/HTTPS/IPSec.

Second Priority

Study Sections 8-11 as essay material. Be ready to explain cloud service/deployment models, 5 actors, top threats, IoT layers, fog computing, constrained devices, and gateway security.

Lower Priority for First 50

Sections 6-7 (NIST Big Data Framework and V-characteristics) remain useful background, but based on the spoiler they are not the main target for the first 50 marks.

Fast Study Order

First pass: Sections 1-5 + the 15 MCQs in Section 12
Second pass: Sections 8-11 + the essay blueprints in Section 12
Last pass only if time remains: NIST Sections 6-7

🔐 Security Fundamentals — The CIA Triad

LECTURE 01 — REVIEW OF CONCEPTS

Confidentiality

Preserving authorized restrictions on information access and disclosure. Only those authorized should access the data.

Integrity

Guarding against improper modification or destruction. Includes ensuring non-repudiation and authenticity.

Availability

Ensuring timely and reliable access to and use of information when needed by authorized users.

Relationship Map Security Concepts Relationship

Threats exploit vulnerabilities to harm assets. Countermeasures protect assets through prevention, detection, and recovery, leaving only residual risk.

Vulnerabilities, Threats & Attacks

Vulnerability categories: Corrupted (loss of integrity), Leaky (loss of confidentiality), Unavailable or very slow (loss of availability).

Threats are capable of exploiting vulnerabilities and represent potential harm to assets. When threats are carried out, they become attacks.

Attack Classification

Passive vs. Active
Insider vs. Outsider

Countermeasures

Prevent, Detect, Recover
May introduce new vulnerabilities
Goal: minimize residual risk

Computer Security Strategy

Security Policy

A formal statement of rules and practices that specify or regulate how a system or organization provides security services to protect sensitive and critical resources.

Security Implementation

Four complementary courses of action: prevention, detection, response, and recovery.

Assurance

The degree of confidence that technical and operational safeguards actually work as intended to protect the system and its information.

Evaluation

The process of examining a computer product or system against specific criteria in order to judge its security properties.

Lecture 01 MCQ hotspots: High-yield terms from the deck are security policy, assurance, evaluation, vulnerability categories, passive vs active attacks, DDoS/botnet, digital envelopes, MAC vs digital signatures, authentication factors, DAC/MAC/RBAC/ABAC, S/MIME, TLS, HTTPS, and IPSec services.

⚔️ Attacks: Passive, Active & Malware

LECTURE 01 — REVIEW OF CONCEPTS

Passive Attacks

Attempt to learn or make use of information without affecting system resources. Hard to detect because nothing is altered. Focus is on prevention, not detection.

Release of Message Contents

Eavesdropping on transmissions to read the content of a message (e.g., email, file transfer).

Traffic Analysis

Even if messages are encrypted, an attacker can observe patterns — frequency, length, source/destination of messages.

Active Attacks

Involve modification of the data stream or creation of a false stream. Four categories:

Masquerade

One entity pretends to be a different entity to gain unauthorized privileges.

Replay

Capturing data units and retransmitting them to produce an unauthorized effect.

Modification of Messages

A portion of a legitimate message is altered, or messages are delayed/reordered.

Denial of Service (DoS)

Prevents or inhibits normal use of communication facilities. Flooding, blocking, disrupting.

Denial of Service (DoS) — Deep Dive

A form of attack on availability. Resources that can be attacked include:

Target	Description
Network Bandwidth	Overwhelm the capacity of the link connecting server to Internet/ISP
System Resources	Overload or crash the network handling software
Application Resources	Valid requests consuming heavy resources, starving other users

DoS techniques: Source address spoofing, SYN spoofing, ICMP/UDP/TCP SYN/HTTP flooding, Slowloris (HTTP requests that never complete), Reflection attacks (through intermediaries).

DDoS (Distributed DoS): Uses multiple compromised systems (zombies) forming a botnet under an attacker's control to generate massive coordinated attacks. The attacker exploits OS/application flaws to install attack programs on many machines.

Malware Classification

Category	Needs Host?	Replicates?	Examples
Viruses	Yes (parasitic)	Yes	Boot sector, macro, polymorphic
Worms	No (independent)	Yes	Network worms, email worms
Trojans	No (independent)	No	Backdoors, RATs
Bots	No (independent)	—	Zombie machines in botnets

🔑 Cryptographic Tools

LECTURE 01 — REVIEW OF CONCEPTS

Fast exam memory: If the question is about secrecy, think encryption. If it is about detecting change, think hash, MAC, or signature. If it is about proving who sent it, think MAC or digital signature. If it is about preventing later denial, the answer is digital signature.

What Each Crypto Tool Answers

Confidentiality

"How do we keep the content secret?" Use encryption so outsiders see ciphertext instead of plaintext.

Integrity

"Did anyone change the message?" Use hashes, MACs, or signatures to detect tampering.

Authentication

"Who really sent this?" Use MACs or digital signatures to bind a message to a sender.

Non-Repudiation

"Can the sender deny it later?" A digital signature is the main tool that prevents denial.

1. Symmetric Encryption

One shared secret key is used for both encryption and decryption. It is fast and efficient, but both sides must already share that secret key securely.

Crypto Flow Symmetric Encryption Model

1. Plaintext

The readable message starts here.

2. Encrypt with shared key K

Sender applies the algorithm using the secret key both sides already share.

3. Ciphertext

The message becomes unreadable to outsiders.

4. Decrypt with the same key K

Receiver uses that same secret key to recover the plaintext.

Symmetric encryption is efficient because the same secret key is reused. Its weak point is key distribution: both parties must get the key without exposing it.

Two lecture categories of symmetric ciphers:

Block Cipher

Processes plaintext in fixed-size blocks such as 64 or 128 bits. Lecture examples: DES, 3DES, AES.

Stream Cipher

Encrypts data one bit or byte at a time using a pseudorandom keystream. Lecture example: RC4.

Best Way to Remember It

Symmetric = fast, but sharing the key is hard. That is the core tradeoff to say in an oral answer.
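
The symmetric model above can be sketched in a few lines. This is a toy illustration, assuming Python's standard library only: the function name `toy_stream_cipher` and its keystream construction (SHA-256 over the key plus a counter) are assumptions made for this example, not a lecture algorithm, and it is not secure for real use. The point is only that the same shared key K both encrypts and decrypts.

```python
import hashlib

def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a pseudorandom keystream derived from the shared key.

    Applying the function twice with the same key returns the original data,
    which is the defining property of a symmetric cipher.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()   # 32 keystream bytes
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

shared_key = b"K: known to sender and receiver"
plaintext = b"Meet at the usual place at 09:00."
ciphertext = toy_stream_cipher(shared_key, plaintext)     # sender encrypts
recovered = toy_stream_cipher(shared_key, ciphertext)     # receiver decrypts
wrong = toy_stream_cipher(b"some other key", ciphertext)  # outsider guesses a key
```

Only the holder of the shared key gets the plaintext back, which is why distributing that key safely is the hard part.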

2. Public-Key (Asymmetric) Encryption

Each user has a public key that can be shared openly and a private key that must stay secret. This solves the key-sharing problem better than symmetric encryption, but it is slower.

Public Key for Confidentiality

Sender encrypts with the recipient's public key. Only the recipient's private key can open the message. Goal: confidentiality.

Private Key for Signing

Sender uses the private key to create a signature, and others verify with the sender's public key. Goal: authentication, integrity, and non-repudiation.

Question	Symmetric	Asymmetric
How many keys?	One shared secret key	Public key + private key pair
Main advantage	Very fast	Easier key distribution
Main drawback	Securely sharing the key is difficult	Slower and more computationally expensive
Typical role in real systems	Bulk data encryption	Key exchange, signatures, certificates

Important distinction: Lecture slides may describe this as "encryption with the private key," but in practice that idea appears as a digital signature. Usually the sender signs a hash of the message, not the whole message.
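
Both uses of a key pair can be shown with textbook RSA on tiny numbers. This is a sketch under stated assumptions: the primes 61 and 53, the exponent 17, and all function names are illustrative choices, there is no padding, and real keys are typically 2048 bits or more. The digest is reduced modulo n only because the toy modulus is so small.

```python
import hashlib

# Toy key pair from tiny primes (illustration only).
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

def encrypt(m: int) -> int:
    # Confidentiality: anyone can encrypt with the recipient's public key.
    return pow(m, e, n)

def decrypt(c: int) -> int:
    # Only the private-key holder can recover the message.
    return pow(c, d, n)

def digest_as_int(message: bytes) -> int:
    # Hash the message; the "% n" exists only because n is tiny here.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # Signing: the sender applies the private key to the hash.
    return pow(digest_as_int(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # Verification: anyone can check with the sender's public key.
    return pow(signature, e, n) == digest_as_int(message)

secret_number = 65
roundtrip = decrypt(encrypt(secret_number))
sig = sign(b"pay Bob 10")
```

Note the direction of the keys: public key in, private key out for confidentiality; private key in, public key out for signatures.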

3. Digital Envelope (Hybrid Encryption)

Real systems often combine both methods. They use symmetric encryption for speed and public-key encryption only to protect the temporary symmetric key.

Hybrid Flow Digital Envelope

1. Generate session key Ks

Sender creates a fresh temporary symmetric key for this message.

2. Encrypt the message with Ks

This is the fast bulk-encryption step.

3. Encrypt Ks with the receiver's public key

Only the receiver's private key can recover that session key.

4. Receiver recovers Ks, then decrypts the message

This combines asymmetric key distribution with symmetric speed.

This is why hybrid encryption is so common: public-key cryptography fixes the key-distribution problem, while symmetric cryptography does the heavy data encryption efficiently.
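
The four envelope steps can be sketched end to end. Everything here is a toy under stated assumptions: the RSA key is built from two known Mersenne primes (trivially factorable, no padding), the symmetric cipher is a SHA-256 keystream XOR, and the names `seal` and `open_envelope` are hypothetical. Only the flow mirrors the steps above.

```python
import hashlib
import secrets

# Toy receiver key pair (illustration only, not secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # receiver's private exponent

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def seal(message: bytes, receiver_public=(N, E)):
    n, e = receiver_public
    session_key = secrets.token_bytes(16)                # 1. fresh session key Ks
    ciphertext = keystream_xor(session_key, message)     # 2. bulk-encrypt with Ks
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)  # 3. wrap Ks
    return wrapped_key, ciphertext

def open_envelope(wrapped_key: int, ciphertext: bytes, receiver_private=(N, D)):
    n, d = receiver_private
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")     # 4a. recover Ks
    return keystream_xor(session_key, ciphertext)                # 4b. decrypt

wrapped, sealed = seal(b"exam notes, keep private")
opened = open_envelope(wrapped, sealed)
```

The expensive public-key operation touches only 16 bytes of key material, while the message itself goes through the fast symmetric step.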

4. Hashes, MACs, and Digital Signatures

These tools are mainly about integrity and authentication, not about hiding the message content.

Tool	Secret Needed?	What It Mainly Gives	Non-Repudiation?
Hash function	No	Integrity check only	No
MAC	Yes, shared secret key	Integrity + authentication between two parties	No
Digital signature	Yes, sender private key	Integrity + authentication + public verifiability	Yes

Message Authentication Code (MAC)

Uses a shared secret key, often together with a hash function (as in HMAC). Both sender and receiver can create or verify it. It gives authentication and integrity, but not non-repudiation because both sides know the same secret.

Digital Signature (Hash-based)

Hash the message, then sign that hash with the sender's private key. The receiver verifies with the sender's public key and compares the digest. This gives authentication, integrity, and non-repudiation.

Common confusion: A plain hash by itself does not prove who sent a message. Anyone, including an attacker, can hash modified data. Adding a shared secret gives a MAC. Using a private key gives a digital signature.
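
The difference between a plain hash and a MAC shows up clearly with Python's standard `hashlib` and `hmac` modules. The messages and the key below are made-up values for illustration; the scenario assumes an attacker who can replace the message in transit but does not know the shared key.

```python
import hashlib
import hmac

message = b"transfer 100 to Alice"
tampered = b"transfer 900 to Mallory"
shared_key = b"secret known only to sender and receiver"   # hypothetical key

# Plain hash: no secret is involved, so the attacker simply recomputes the
# hash over the tampered message and the receiver's check still passes.
attacker_hash = hashlib.sha256(tampered).hexdigest()
receiver_hash = hashlib.sha256(tampered).hexdigest()
hash_check_passes = hmac.compare_digest(attacker_hash, receiver_hash)

# MAC (HMAC-SHA256): the attacker can only reuse the genuine tag, which does
# not match the tag the receiver computes over the tampered message.
genuine_tag = hmac.new(shared_key, message, hashlib.sha256).digest()
expected_for_tampered = hmac.new(shared_key, tampered, hashlib.sha256).digest()
mac_check_passes = hmac.compare_digest(genuine_tag, expected_for_tampered)
```

Because the receiver could also have produced `genuine_tag`, the MAC still cannot settle a dispute between the two parties; that requires a signature.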

5. Public-Key Certificates

A certificate binds a public key to a real identity. It is issued by a Certificate Authority (CA), and the CA signs the certificate with its own private key so others can verify that binding.

Trust Chain How a Certificate Builds Trust

1. Alice has a public key

By itself, that key is just data. Others still need to trust who it belongs to.

2. The CA signs the binding

The certificate says, in effect, "this public key belongs to Alice."

3. Bob verifies with the CA public key

If Bob trusts the CA, he can trust Alice's certified public key.

The certificate does not hide data. Its job is to make a public key trustworthy by linking it to an identity through the CA's signature.
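
The three trust-chain steps can be imitated with a toy CA. This is a sketch only: the CA key is built from two known Mersenne primes (trivially factorable, no padding), the certificate is a plain dictionary rather than a real X.509 structure, and the names `ca_issue` and `verify_certificate` are hypothetical.

```python
import hashlib

# Toy CA key pair (illustration only, not secure).
CA_P, CA_Q = 2**521 - 1, 2**607 - 1
CA_N, CA_E = CA_P * CA_Q, 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))    # kept private by the CA

def binding_digest(subject: str, subject_public_key: str) -> int:
    # Hash of the statement "this public key belongs to this subject".
    data = f"{subject}|{subject_public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def ca_issue(subject: str, subject_public_key: str) -> dict:
    # The CA signs the binding with its own private key.
    signature = pow(binding_digest(subject, subject_public_key), CA_D, CA_N)
    return {"subject": subject, "public_key": subject_public_key,
            "signature": signature}

def verify_certificate(cert: dict, ca_public=(CA_N, CA_E)) -> bool:
    # Bob needs only the CA's public key to check the binding.
    n, e = ca_public
    expected = binding_digest(cert["subject"], cert["public_key"])
    return pow(cert["signature"], e, n) == expected

alice_cert = ca_issue("Alice", "alice-public-key-bytes")
forged_cert = dict(alice_cert, public_key="mallory-public-key-bytes")
```

Swapping in a different public key breaks the CA's signature, so Bob rejects the forged certificate without ever contacting Alice.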

🪪 User Authentication & Access Control

LECTURE 01 — REVIEW OF CONCEPTS

Means of Authentication (4 Factors)

Factor	Description	Examples
Something you know	Knowledge-based	Password, PIN, security questions
Something you have	Token-based	Smartcard, electronic keycard, physical key
Something you are	Static biometrics	Fingerprint, retina, face
Something you do	Dynamic biometrics	Voice pattern, handwriting, typing rhythm

Remote User Authentication: More complex due to additional threats such as eavesdropping, password capture, and replay attacks. Generally uses challenge-response protocols to counter these threats.
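
A challenge-response exchange can be sketched with a fresh nonce and HMAC. This is a minimal illustration, not a specific lecture protocol: the `Server` class, the `client_response` helper, and the sample secret are assumptions, and a real system would also protect the stored secrets and the channel.

```python
import hashlib
import hmac
import secrets

class Server:
    def __init__(self, user_secrets):
        self.user_secrets = user_secrets      # username -> shared secret
        self.pending = {}                     # username -> outstanding nonce

    def challenge(self, user: str) -> bytes:
        nonce = secrets.token_bytes(16)       # fresh, unpredictable challenge
        self.pending[user] = nonce
        return nonce

    def verify(self, user: str, response: bytes) -> bool:
        nonce = self.pending.pop(user, None)  # each nonce is single-use
        if nonce is None or user not in self.user_secrets:
            return False
        expected = hmac.new(self.user_secrets[user], nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)

def client_response(secret: bytes, nonce: bytes) -> bytes:
    # The secret never crosses the network, only a keyed hash of the nonce.
    return hmac.new(secret, nonce, hashlib.sha256).digest()

server = Server({"alice": b"correct horse battery staple"})
nonce = server.challenge("alice")
response = client_response(b"correct horse battery staple", nonce)
first_try = server.verify("alice", response)        # legitimate login
replayed = server.verify("alice", response)         # captured response replayed
server.challenge("alice")                           # next login gets a new nonce
replayed_later = server.verify("alice", response)   # old response, new nonce
```

An eavesdropper sees the nonce and the response but learns nothing reusable, because the next login is tied to a different nonce.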

Access Control Policies

DAC — Discretionary

Based on identity of the requestor and access rules (authorizations). The owner of the resource decides who can access it.

MAC — Mandatory

Based on comparing security labels (e.g., Top Secret, Secret) with user security clearances. System-enforced, not owner-discretionary.

RBAC — Role-Based

Based on roles that users have within the system. Access is assigned to roles, users are assigned to roles. Simplifies large-scale management.

ABAC — Attribute-Based

Based on attributes of user, resource, and environment (e.g., time of day, location, department). Most flexible and fine-grained model.
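
The four policies differ mainly in what the access decision looks at. The sketch below reduces each one to a single check; the function names, the sample users, and the office-hours rule are all hypothetical and only illustrate the decision input for each model.

```python
from datetime import time

def dac_allows(acl, user, resource, action):
    # DAC: the resource owner maintains a per-resource access list.
    return action in acl.get(resource, {}).get(user, set())

LEVELS = {"Unclassified": 0, "Secret": 1, "Top Secret": 2}

def mac_allows_read(clearance, label):
    # MAC: the system compares clearance with the label; owners cannot override.
    return LEVELS[clearance] >= LEVELS[label]

def rbac_allows(user_roles, role_permissions, user, permission):
    # RBAC: users map to roles, and roles map to permissions.
    return any(permission in role_permissions.get(role, set())
               for role in user_roles.get(user, set()))

def abac_allows(user_attrs, resource_attrs, env_attrs):
    # ABAC: a rule over user, resource, and environment attributes.
    return (user_attrs["department"] == resource_attrs["department"]
            and time(9, 0) <= env_attrs["time"] <= time(17, 0))

acl = {"report.pdf": {"alice": {"read", "write"}, "bob": {"read"}}}
user_roles = {"carol": {"nurse"}}
role_permissions = {"nurse": {"read_chart"},
                    "doctor": {"read_chart", "prescribe"}}
```

For example, `rbac_allows(user_roles, role_permissions, "carol", "prescribe")` is false until Carol is assigned the doctor role, with no per-user permission edits needed.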

Trust Chain Access Control Context

Authentication establishes identity, authorization applies policy to that identity, and auditing records the resulting actions for review and accountability.

🌐 Internet Security Protocols

LECTURE 01 — REVIEW OF CONCEPTS

S/MIME — Secure Email

MIME extends the old RFC 822 email format to support multimedia content. S/MIME adds security: ability to sign and/or encrypt email messages. Based on RSA technology. Provides confidentiality (encryption), authentication & integrity (digital signatures).

SSL/TLS

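One of the most widely used security services. A general-purpose security layer implemented on top of TCP. SSL evolved into the Internet standard TLS. The lecture cites RFC 4346, which defines TLS 1.1; later versions are TLS 1.2 (RFC 5246) and TLS 1.3 (RFC 8446).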

Protocol Stack SSL/TLS Protocol Stack

Handshake, alerts, and application data are all carried by the TLS Record Protocol, which runs on top of TCP/IP.

The TLS Record Protocol processes application data in five steps: fragmentation → compression (optional) → add MAC → encrypt → add the TLS record header (placed in front of the encrypted fragment).
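
The record-processing pipeline can be traced in code. This is a simplified sketch, not a working TLS implementation: it assumes the TLS 1.2 wire version bytes and HMAC-SHA256, uses a placeholder XOR cipher named `toy_encrypt` instead of a real cipher suite, and the function `build_records` is hypothetical. Only the order of the five steps is the point.

```python
import hashlib
import hmac
import zlib

MAX_FRAGMENT = 2**14      # largest plaintext fragment a record may carry
APPLICATION_DATA = 23     # record content type for application data
VERSION = b"\x03\x03"     # version bytes used on the wire by TLS 1.2

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher (XOR with a SHA-256 keystream); NOT a real TLS cipher."""
    pad = b"".join(hashlib.sha256(key + i.to_bytes(8, "big")).digest()
                   for i in range(len(data) // 32 + 1))
    return bytes(b ^ p for b, p in zip(data, pad))

def build_records(data: bytes, mac_key: bytes, enc_key: bytes) -> list:
    records = []
    fragments = [data[i:i + MAX_FRAGMENT]
                 for i in range(0, len(data), MAX_FRAGMENT)]   # 1. fragmentation
    for seq, fragment in enumerate(fragments):
        seq_bytes = seq.to_bytes(8, "big")
        compressed = zlib.compress(fragment)                   # 2. compression
        mac_input = (seq_bytes + bytes([APPLICATION_DATA]) + VERSION
                     + len(compressed).to_bytes(2, "big") + compressed)
        mac = hmac.new(mac_key, mac_input, hashlib.sha256).digest()  # 3. add MAC
        body = toy_encrypt(enc_key + seq_bytes, compressed + mac)    # 4. encrypt
        header = (bytes([APPLICATION_DATA]) + VERSION
                  + len(body).to_bytes(2, "big"))
        records.append(header + body)                          # 5. add header
    return records

records = build_records(b"A" * 40000, b"mac-key", b"enc-key")
```

Each record starts with a 5-byte header (type, version, length), so a receiver can split the TCP byte stream back into records before decrypting and checking the MAC.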

HTTPS

Combination of HTTP and SSL/TLS. URLs begin with https://. The HTTP client acts as the TLS client. Closure requires TLS close, then TCP close.
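
In Python, the HTTP client becoming the TLS client looks roughly like this, using the standard `ssl` and `socket` modules. The helper names `make_tls_context` and `fetch_status_line` are made up for this sketch, and the host passed in is whatever server the reader chooses; nothing here is specific to the lecture.

```python
import socket
import ssl

def make_tls_context() -> ssl.SSLContext:
    # The default context verifies the server certificate chain and host name.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse SSL and early TLS
    return context

def fetch_status_line(host: str, port: int = 443) -> str:
    """Open TCP, run the TLS handshake as the client, then send one HTTP request."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # wrap_socket performs the handshake; server_hostname enables SNI
        # and host-name checking against the certificate.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            reply = tls_sock.recv(4096).decode("iso-8859-1")
    return reply.split("\r\n", 1)[0]
```

The HTTP request itself is unchanged; it is simply written to the TLS socket instead of the raw TCP socket, which is the whole idea of HTTPS.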

IPSec (IP Security)

Operates at the network layer. Provides: authentication, confidentiality, and key management.

Applications: Secure branch office connectivity, secure remote access, extranet/intranet connectivity, enhanced e-commerce security.

📊 NIST Big Data Interoperability Framework

LECTURE 02 — NIST SP 1500-4r2

What is NBDIF? The NIST Big Data Interoperability Framework (NBDIF) is a series of 9 volumes. Volume 4 (SP 1500-4r2, Version 3, Oct 2019) focuses specifically on Security and Privacy in Big Data.

What is Different About Big Data Security & Privacy?

The NIST subgroup identified 8 key differences from traditional implementations:

1. Heterogeneous Components

BD projects often combine heterogeneous components for which no single security scheme was designed from the outset.

2. Streamed + At-Rest Data

BD increasingly involves one or more streamed data sources used together with data at rest — creating unique security scenarios.

3. Multi-Source Privacy Risk

Using multiple sources not originally intended together can compromise PII de-identification. Fusion of datasets exacerbates re-identification risk.

4. IoT Sensor Explosion

Huge increase in sensor streams (smart devices, cities, homes) creates vulnerabilities in connectivity, transport, and aggregation.

5. Commodity Big Data Sources

Data types once thought too big to analyze (geospatial, video) are becoming commodity BD sources, often without security and privacy measures having been anticipated.

6. Veracity & Jurisdiction Magnified

Issues of context, provenance, and jurisdiction are greatly magnified. Multiple organizations, governments, citizens affected.

7. Data Permanence (Volatility)

BD envisions data as permanent by default. Data may outlive the security measures designed to protect it.

8. Cross-Org Data Sharing

Data/code shared across organizations, but standards assume single-org management. Small teams can create valuable BD with less governance.

Other Potential Differences

Inter-organizational issues: federation, data licensing
Mobile/geospatial increases deanonymization risk
No archive/destroy lifecycle — data lives forever
BD as technology accelerator for security: blockchain, NoSQL, ML-based intrusion detection
Transborder data flows across national boundaries
Consent frameworks via smart contracts / blockchain
Risk management shifts to inter-organizational focus
DevOps/agile with small teams (even single-developer)

Overview of Requirements

Rapid Responses

BD on public cloud with diverse hardware/OS/software. Streaming cloud technology demands extremely rapid security responses.

New Approaches

Actor/role-based BD system representations require different security facets. Approaches will evolve with the BD landscape.

Standardization

BD used across diverse industries (healthcare, finance, marketing). Effective cross-industry communication requires standardized security/privacy terms.

What is New — Key Points

Unprecedented mix of human and device actor types → new threat vector combinations
Data aggregation/dissemination must be secured in a formal framework
Search and selection of data accentuates privacy concerns
Privacy of PII must be protected at every stage (end-to-end)
Governance is becoming an intrinsic design consideration
Legacy security (auth, ACL) must be retargeted to BD HPC resources
Information Assurance & Disaster Recovery need unique practices at extreme scale
BD systems are concentrated, high-value targets for adversaries
Emerging risks in open data: data identification, metadata tagging, aggregation may degrade veracity

Security & Privacy Taxonomies

NIST Taxonomy Conceptual Taxonomy (4 Pillars)

The conceptual taxonomy groups security and privacy concerns into four big themes: confidentiality, provenance, system health, and policy or governance concerns.

NIST Taxonomy Operational Taxonomy (5 Domains)

The operational taxonomy turns the big conceptual themes into domains that can be implemented, monitored, and governed inside a real big-data deployment.

NIST Big Data Reference Architecture (NBDRA)

The NBDRA defines the key components and their interactions in a Big Data system:

NIST Architecture NBDRA — Key Components

In the NBDRA, security and privacy are not a separate box. They form a fabric that surrounds and influences the provider, application, framework, consumer, and orchestration roles.

Security & Privacy Fabric: The fundamental idea is that security and privacy are not a separate component but a fabric that permeates ALL components of the NBDRA. It spans from Data Provider through Application Provider and Framework Provider to Data Consumer.

Security & Privacy Overlay — Details

Component	Security Functions
Data Provider	End-point input validation, real-time security monitoring, data discovery & classification, secure data aggregation
Application Provider	Data-centric security (identity/policy-based encryption), policy management for access control, computing on encrypted data (homomorphic encryption), granular audits, granular access control
Framework Provider	Securing data storage & transaction logs, key management, security best practices for non-relational data stores, security against DoS, data provenance
Data Consumer	Privacy-preserving data analytics & dissemination, compliance with regulations (HIPAA etc.), government access and freedom of expression concerns
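
"Computing on encrypted data" in the Application Provider row can be illustrated with a well-known property of unpadded textbook RSA: multiplying two ciphertexts yields a ciphertext of the product. This is only a toy demonstration of the idea, with tiny assumed parameters; it is not the scheme NIST has in mind and textbook RSA is only partially (multiplicatively) homomorphic.

```python
# Toy RSA parameters (illustration only, far too small to be secure).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 6
# The untrusted party multiplies ciphertexts without ever seeing a or b.
product_ciphertext = (encrypt(a) * encrypt(b)) % n
result = decrypt(product_ciphertext)   # equals a * b as long as a * b < n
```

The data owner decrypts `result` and obtains the product, while the party that did the multiplication handled only ciphertexts.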

📐 Security Impacts on Big Data V-Characteristics

LECTURE 02 — NIST SP 1500-4r2

Volume

Size: GB → exabytes+. Multi-tiered storage introduces threats: confidentiality/integrity, provenance, availability, consistency, collusion attacks, roll-back attacks, recordkeeping disputes. Flip side: analytics on volumes of data can help detect security breaches.

Velocity

Batch or continuous streaming. Distributed frameworks not designed with security in mind. Risks: malfunctioning nodes leaking data, partial infrastructure attacks, rogue nodes eavesdropping if strong authentication is absent.

Variety

Structured, semi-structured, unstructured. Retargeting relational DB security to non-relational = challenge. Encryption hinders semantic organization. Big Data variety allows inferring identity from anonymized datasets — attribute combinations enable re-identification.
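
The re-identification risk from attribute combinations can be made concrete with a small group-size check (the idea behind k-anonymity). The records below are invented for illustration, and `smallest_group` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
records = [
    {"zip": "10001", "birth_year": 1990, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1990, "sex": "M", "diagnosis": "flu"},
    {"zip": "10001", "birth_year": 1985, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "10002", "birth_year": 1990, "sex": "F", "diagnosis": "flu"},
    {"zip": "10002", "birth_year": 1990, "sex": "F", "diagnosis": "asthma"},
]

def smallest_group(rows, attributes):
    """Return the size of the smallest group sharing the same attribute values.

    A result of 1 means at least one record is unique on those attributes and
    could be linked to a named person by anyone holding a second dataset.
    """
    groups = Counter(tuple(row[a] for a in attributes) for row in rows)
    return min(groups.values())

k_zip_only = smallest_group(records, ["zip"])
k_combined = smallest_group(records, ["zip", "birth_year", "sex"])
```

On ZIP code alone every record hides in a group of at least two, but once birth year and sex are combined with it, some records become unique. More variety in the data means more attribute combinations available to an attacker.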

Veracity (3 sub-aspects)

Provenance: Understanding original source, consent, intended use, chain of custody
Curation: Fixes errors, fills gaps, models data — binds veracity to governance & quality
Validity: Accuracy/correctness for application. Risk of click fraud, misinterpreted social media

Volatility

How data structures change over time. BD data may be permanent by default — outliving its creators and security measures. Roles/governance shift as organizations merge or disappear. Temporality must be considered.

Effects of Cloud Computing on BD Security

Broad network access — exposed to more threats
Decreased visibility/control by consumers
Dynamic system boundaries and shifting responsibilities
Multi-tenancy — different orgs share infrastructure
Data residency — where is your data physically?
Measured service — usage tracking
Order-of-magnitude increases in scale, dynamics (elasticity), complexity (automation, virtualization)

Use Cases from NIST

Domain	Security/Privacy Concerns
Retail / Marketing	Consumer data via web analytics, MAC address tracking, IP logging — individual data collected by multiple means
Healthcare	Health information exchange, differential privacy, genetic privacy, pharma clinical trial sharing, patient-level disclosure
Cybersecurity	Network protection data collection, data governance, encryption/key management, tenant isolation/containerization
Government	UAV sensor data (military + civilian), education performance reporting scored by private firms

Mobile Devices & Big Data

Mobility is a critical BD element
BYOD challenges governance and enterprise controls
Web/desktop apps migrated to mobile may lack adequate security
Less physical security, yet full access to BD systems
Geospatial data from mobile devices can enrich datasets and enable deanonymization

☁️ Cloud Computing & Security

LECTURE 03 — CLOUD SECURITY (L07)

Cloud Computing Elements

Cloud Networking

Network and network-management capabilities required to access cloud services. This can include Internet access, dedicated private connectivity between subscriber and provider, and security enforcement using firewalls and related controls.

Cloud Storage

A subset of cloud computing in which database storage and related applications are hosted remotely. Its value comes from scalability and relief from buying, maintaining, and managing local storage assets.

Cloud Service Models

The key exam comparison is how responsibilities shift across SaaS, PaaS, and IaaS.

Shared Responsibility Service Models — Responsibility Stack

As you move from traditional infrastructure to SaaS, more of the stack is operated by the provider. Data protection remains the customer's responsibility in every model.

SaaS

Provider manages the entire stack, from infrastructure up to the application. Customer only uses the application and remains responsible for its own data. App software provided by cloud, visible to subscriber.

PaaS

Customer develops & deploys applications. Platform managed by provider. App software developed by subscriber, platform visible to subscriber.

IaaS

Customer controls OS, storage, apps. Provider manages underlying infrastructure. Maximum flexibility, most responsibility.
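
The shift in responsibility across the three models can be written as a small lookup. The five-layer split and the cut points below are a simplification assumed for this sketch, based on the descriptions above; real provider contracts draw the line in more detail.

```python
# Layers ordered from what the customer always keeps (its data) down to hardware.
LAYERS = ["data", "application", "platform", "operating system", "infrastructure"]

# Number of top layers the customer is responsible for in each model.
CUSTOMER_LAYER_COUNT = {"IaaS": 4, "PaaS": 2, "SaaS": 1}

def responsibilities(model: str) -> dict:
    """Map each layer to 'customer' or 'provider' for the given service model."""
    cut = CUSTOMER_LAYER_COUNT[model]
    return {layer: ("customer" if index < cut else "provider")
            for index, layer in enumerate(LAYERS)}

iaas = responsibilities("IaaS")
paas = responsibilities("PaaS")
saas = responsibilities("SaaS")
```

Reading the three dictionaries side by side gives the exam comparison directly: the "customer" entries shrink from IaaS to PaaS to SaaS, but "data" never moves to the provider.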

Other Cloud Services

Service	Description
CaaS (Communications)	Video conferencing, web conferencing, IM, VoIP integration
CompaaS (Compute)	Processing resources — simplified IaaS focused on compute capacity
DSaaS (Data Storage)	Data storage provision via Internet, accessed through provider software
NaaS (Network)	VPN, bandwidth on demand, custom routing, firewalls, IDS/IPS, WAN, content filtering
XaaS (Anything/Everything)	Umbrella term for any service delivered via cloud. Benefits: lower costs, lower risk, faster innovation

Cloud Deployment Models

Public Cloud

Available to general public. Provider owns infrastructure. Multi-tenant. Outside enterprise firewall. Advantage: cost. Concern: security. Lower SLAs typically.

Private Cloud

Within organization's internal IT. Can be managed in-house or by third party. On-premises or off-premises. Key motivation: security. Examples: DB on demand, email on demand.

Community Cloud

Shared among orgs with similar requirements. Restricted access like private, but shared resources like public. Example: healthcare industry. Can comply with government regulations.

Hybrid Cloud

Composition of 2+ clouds (private/community/public). Bound by standardized technology. Sensitive data in private area, less sensitive in public. Attractive for smaller businesses.

NIST Cloud Computing Reference Architecture — 5 Actors

NIST SP 500-292 NIST Cloud Reference Architecture — Actors

The provider is the operating core. Consumers use services, brokers mediate or aggregate them, carriers provide network transport, and auditors independently assess the environment.

Actor	Role
Cloud Consumer	Person/org that uses cloud services
Cloud Provider	Entity responsible for making services available (service orchestration, management, physical resources)
Cloud Auditor	Conducts independent assessment — security audits, privacy impact audits, performance audits
Cloud Broker	Manages use, performance, and delivery: service intermediation, aggregation, and arbitrage
Cloud Carrier	Provides connectivity and transport between consumer and provider

If the essay is on cloud computing: A strong answer order is: 1) define cloud and mention networking/storage elements, 2) compare SaaS/PaaS/IaaS responsibilities, 3) explain deployment models, 4) present the 5 NIST actors, and 5) move to the top cloud threats plus key countermeasures.

🛡️ Cloud Security Threats & Countermeasures

LECTURE 03 — CLOUD SECURITY (L07)

Core challenge: The enterprise loses substantial control over resources, services, and applications but must maintain accountability for security and privacy policies.

The Cloud Security Alliance identified 7 top cloud-specific security threats:

1. Abuse & Nefarious Use

Easy registration → spamming, malicious code, DoS launched from cloud.

Countermeasures: Stricter registration/validation, enhanced credit card fraud monitoring, comprehensive traffic inspection, monitoring public blacklists.

2. Insecure Interfaces & APIs

Cloud services rely on APIs. Security depends on these interfaces.

Countermeasures: Analyze CP security model, strong authentication + encrypted transmission, understand API dependency chains.

3. Malicious Insiders

Unprecedented trust given to CP. High-risk roles: system admins, managed security providers.

Countermeasures: Strict supply chain management, HR requirements in contracts, transparency in security practices, breach notification processes.

4. Shared Technology Issues

Isolated VMs still vulnerable. Multi-tenancy introduces shared risk.

Countermeasures: Security best practices for installation/config, monitoring, strong auth for admin access, SLA enforcement for patching, vulnerability scanning.

5. Data Loss or Leakage

Data must be secured at rest, in transit, and in use.

Countermeasures: Strong API access control, two models (multi-instance: unique DBMS per subscriber; multi-tenant: shared env with tagging), encrypt data (ideally CP has no access to keys), strong key management.

6. Account / Service Hijacking

Stolen credentials → access to critical cloud services.

Countermeasures: No sharing of credentials, strong two-factor authentication, proactive monitoring, understand CP security policies/SLAs.

7. Unknown Risk Profile

Client must define roles/responsibilities for risk management. Shadow IT risk (unapproved deployments).

Countermeasures: Disclosure of logs/data, partial/full infrastructure disclosure (patch levels, firewalls), monitoring and alerting.

Security as a Service (SecaaS)

Cloud-based security services include:

Encryption

Cloud-provided encryption services

E-mail Security

Anti-spam, anti-phishing, anti-malware

Identity & Access Mgmt

IAM as a cloud service

Web Security

Web application firewalls, URL filtering

Intrusion Management

Cloud-based IDS/IPS

Data Loss Prevention

DLP monitoring and enforcement

Security Assessments

Vulnerability scanning, penetration testing

BCDR

Business continuity & disaster recovery

Network Security

Firewalls, network monitoring

SIEM

Security information & event management

📡 Internet of Things — Architecture & Components

LECTURE 04 — IoT SECURITY (L08)

Key Definitions (ITU-T)

Term	Definition
IoT	A global infrastructure for the information society, enabling advanced services by interconnecting physical and virtual things based on interoperable ICT
Thing	An object (physical or virtual) capable of being identified and integrated into communication networks
Device	Equipment with mandatory communication capability + optional sensing, actuation, data capture/storage/processing

Components of IoT-Enabled Things

Sensors

Detect environmental parameters (temperature, pressure, motion, etc.)

Actuators

Perform physical actions based on commands (motors, switches, valves)

Microcontroller

Embedded computing capability — process sensor data, control actuators

Transceiver

Communication means — essential ingredient, enables network participation

RFID

Radio-frequency identification for tracking objects, animals, humans. Tags + readers.

Deeply Embedded Systems

A subset of embedded systems using a microcontroller (not microprocessor), not programmable after ROM burn, no user interaction. Dedicated single-purpose devices that detect, process, and act. IoT depends heavily on them. Have extreme resource constraints: memory, processor, time, power.

IoT Reference Models

ITU-T Y.2060 Reference Model (4 Layers)

Management and security are cross-cutting capabilities in Y.2060. They apply across every core layer rather than sitting in only one layer.

IoT World Forum Reference Model (7 Layers)

Layers 1 to 3 live at the operational edge, while layers 4 to 7 live in the center. Layer 3 is the fog-computing transition between real-time control and central analytics.

Key distinction: Layers 1-3 are at the Edge (OT, event-based, real-time, data-in-motion). Layers 4-7 are at the Center (IT, query-based, non-real-time, data-at-rest). Layer 3 (Fog Computing) is the critical transition point.

Fog Computing

Distributed intelligence between the edge devices and the cloud data center. Four tiers:

Tier	Connectivity	Scale	Response
Smart Things Network	Bluetooth, Wi-Fi, Wired	Millions of devices	Millisecond
Fog Network	3G/4G/LTE/Wi-Fi	Tens of thousands	Real-time
Core Network	IP/MPLS	Thousands	QoS/QoE driven
Data Center / Cloud	Ethernet	Hundreds	Transactional

🔒 IoT Security

LECTURE 04 — IoT SECURITY (L08)

IoT Security Elements of Interest

The IoT security landscape includes four types of elements connected to the Internet or enterprise network:

A — Application / Management / Storage Platform

Servers and cloud platforms that manage, store, and process IoT data. In the lecture figure, shaded elements are the ones that include security features.

G — Gateway

Bridges constrained IoT devices with the enterprise/Internet. Protocol translation, data aggregation, local processing.

U — Unconstrained Device

Full-capability devices that can support standard security protocols (TLS, etc.).

C — Constrained Device

Extremely limited resources (memory, processor, power). Cannot support heavy security protocols.

IoT Security Topology Security Elements of Interest

The gateway is the critical security boundary for constrained devices. Unconstrained devices can often participate more directly in standard secure protocols, while constrained devices typically rely on gateway mediation.

ITU-T Y.2066 — IoT Security Requirements

Functional requirements for capturing, storing, transferring, aggregating, and processing IoT data:

Requirement	Scope
Communication Security	Secure data transport between devices, gateways, and cloud
Data Management Security	Protect data integrity and confidentiality during lifecycle
Service Provision Security	Ensure services delivered securely to end users
Integration of Security Policies	Harmonize security across heterogeneous devices and systems
Mutual Authentication & Authorization	Devices and services authenticate each other before interaction
Security Audit	Logging and auditing of security-relevant events

IoT Gateway Security Functions

The IoT gateway serves as a critical security boundary, providing: protocol translation security, device authentication, data filtering and validation, local security policy enforcement, secure communication uplink to cloud, and firmware update management.
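
Two of these functions, device authentication and data filtering/validation, can be sketched as follows. This is a minimal illustration only: the per-device key table, the `temp_c` field, the plausibility range, and the choice of HMAC-SHA256 as a lightweight authentication mechanism are all assumptions, not something prescribed by the lecture.

```python
import hashlib
import hmac
import json

# Hypothetical per-device keys provisioned on the gateway (assumption).
DEVICE_KEYS = {"sensor-01": b"demo-key-sensor-01"}

# Assumed plausibility range for a temperature reading, in Celsius.
VALID_RANGE = (-40.0, 125.0)


def sign(device_id: str, payload: bytes) -> str:
    """What a device would compute before sending (included for the demo)."""
    return hmac.new(DEVICE_KEYS[device_id], payload, hashlib.sha256).hexdigest()


def gateway_accept(device_id: str, payload: bytes, tag: str) -> bool:
    """Return True only if the message is authenticated and plausible."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False                      # unknown device
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False                      # device authentication failed
    try:
        value = float(json.loads(payload)["temp_c"])
    except (ValueError, KeyError, TypeError):
        return False                      # malformed data is filtered out
    return VALID_RANGE[0] <= value <= VALID_RANGE[1]


good = b'{"temp_c": 21.5}'
hot = b'{"temp_c": 900}'
accepted = gateway_accept("sensor-01", good, sign("sensor-01", good))
rejected_tag = gateway_accept("sensor-01", good, "0" * 64)
rejected_range = gateway_accept("sensor-01", hot, sign("sensor-01", hot))
rejected_device = gateway_accept("sensor-99", good, "0" * 64)
```

Only messages that pass both checks would be forwarded over the secure uplink; the uplink itself and firmware update management are not modeled here.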

Cisco Secure IoT Framework

Cisco's IoT security environment addresses threats across the entire IoT deployment: physical security of devices, network security (segmentation, encryption), application security, data security, and identity management, with security monitoring and response spanning all layers.

NIST on IoT & Big Data (from NIST SP 1500-4r2): Until IoT hardware matures sufficiently to support TLS and cryptographic authentication, IoT data will typically be collected under a single provider per device type. IoT aggregate Data Providers should authenticate individual IoT device connections prior to accepting data. Veracity is strongly dependent on hardware and protocol implementation details.

If the essay is on IoT security: a strong answer order is 1) define IoT, thing, and device, 2) present IoT-enabled components, 3) explain the 4-layer Y.2060 model and the 7-layer IoT World Forum model, 4) define fog computing as the edge-to-center transition, and 5) finish with Y.2066 security requirements, constrained devices, and gateway security functions.

📝 Practice Questions

EXAM-MATCHED DRILL — 15 FUNDAMENTALS MCQs + CLOUD/IoT ESSAY PREP

Part A — 15 MCQs from Lecture 01 (30 Marks)

Q01

Which of the following is NOT part of the CIA triad?

A) Confidentiality
B) Integrity
C) Authentication
D) Availability

C) Authentication — The CIA triad consists of Confidentiality, Integrity, and Availability. Authentication is a related but separate security concept.

Q02

A leaky vulnerability primarily corresponds to loss of:

A) Availability
B) Integrity
C) Confidentiality
D) Authenticity

C) Confidentiality — A leaky system exposes information improperly. The lecture classifies vulnerabilities as corrupted (integrity), leaky (confidentiality), and unavailable/slow (availability).

Q03

Traffic analysis is classified as which type of attack?

A) Active attack
B) Passive attack
C) Insider attack
D) Denial of Service

B) Passive attack — Traffic analysis observes communication patterns without altering the system resources or message content itself.

Q04

Which of the following is NOT one of the four active attack categories listed in the lecture?

A) Masquerade
B) Replay
C) Traffic analysis
D) Modification of messages

C) Traffic analysis — Traffic analysis is passive. The four active categories are masquerade, replay, message modification, and denial of service.

Q05

Which of the following is NOT one of the four complementary courses of action in security implementation?

A) Prevention
B) Detection
C) Evaluation
D) Recovery

C) Evaluation — The lecture’s four courses of action are prevention, detection, response, and recovery. Evaluation is a separate concept: examining a system against criteria.

Q06

In the lecture, assurance means:

A) Encrypting all stored data
B) The degree of confidence that safeguards work as intended
C) Formal evaluation by a government agency only
D) Recovery after an incident

B) The degree of confidence that safeguards work as intended — Assurance is confidence in the effectiveness of technical and operational security measures.

Q07

Which malware category is an independent, self-contained program that replicates?

A) Virus
B) Worm
C) Trojan
D) Macro

B) Worm — Worms are independent programs and they replicate. Viruses replicate too, but they are parasitic and require a host program.

Q08

Slowloris is best described as:

A) A replay attack against encrypted packets
B) An HTTP DoS attack using requests that never complete
C) A phishing attack against cloud accounts
D) A botnet command-and-control protocol

B) An HTTP DoS attack using requests that never complete — The lecture explicitly lists Slowloris under HTTP flooding and notes that the HTTP requests never complete.

Q09

In a digital envelope, the symmetric key used to encrypt the message is itself encrypted with:

A) The sender's private key
B) The sender's public key
C) The recipient's public key
D) A shared secret key

C) The recipient's public key — The message is encrypted with a random symmetric key, and that key is then encrypted with the recipient’s public key.
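
The envelope steps can be traced with a toy sketch. Everything here is deliberately simplified and NOT secure: two known Mersenne primes stand in for a real RSA key pair, and the symmetric cipher is an XOR with a SHA-256 counter keystream, chosen only so the example runs with the standard library. The function names `seal` and `open_envelope` are hypothetical.

```python
import hashlib
import secrets

# TOY PARAMETERS, NOT SECURE: known Mersenne primes instead of secret primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                  # recipient's public modulus
E = 65537                                  # recipient's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # recipient's private exponent


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(message: bytes):
    """Sender: pick a random session key, encrypt the message with it,
    then encrypt the session key with the recipient's PUBLIC key."""
    ks = secrets.token_bytes(16)
    ciphertext = keystream_xor(ks, message)
    wrapped_ks = pow(int.from_bytes(ks, "big"), E, N)
    return wrapped_ks, ciphertext


def open_envelope(wrapped_ks: int, ciphertext: bytes) -> bytes:
    """Recipient: recover the session key with the PRIVATE key, then decrypt."""
    ks = pow(wrapped_ks, D, N).to_bytes(16, "big")
    return keystream_xor(ks, ciphertext)


message = b"exam notes"
wrapped, ct = seal(message)
recovered = open_envelope(wrapped, ct)
```

The point to notice for the exam is which key appears where: the sender only ever uses `E` and `N` (the recipient's public key), so no secret has to be shared in advance.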

Q10

Why does a MAC (Message Authentication Code) not provide non-repudiation?

A) Because MACs cannot verify integrity
B) Because MACs do not use hashing
C) Because the secret key is shared by sender and receiver
D) Because MACs operate only at the network layer

C) Because the secret key is shared by sender and receiver — Since both parties know the same key, either of them could have generated the MAC.
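
A short sketch makes the point concrete. It uses HMAC-SHA256 from the Python standard library as one example of a MAC; the key, the message, and the names Alice and Bob are hypothetical.

```python
import hashlib
import hmac

shared_key = b"key-known-to-alice-and-bob"   # hypothetical shared secret
message = b"Transfer 100 to Bob"

# Alice computes the tag before sending.
tag_from_alice = hmac.new(shared_key, message, hashlib.sha256).digest()

# Bob verifies by recomputing the tag with the same key.
tag_from_bob = hmac.new(shared_key, message, hashlib.sha256).digest()
integrity_ok = hmac.compare_digest(tag_from_alice, tag_from_bob)

# A modified message produces a different tag, so tampering is detected.
tampered_tag = hmac.new(shared_key, b"Transfer 900 to Bob", hashlib.sha256).digest()
tamper_detected = not hmac.compare_digest(tag_from_alice, tampered_tag)

# But Bob's recomputed tag is identical to Alice's, so the tag alone cannot
# prove to a third party which of the two key holders produced it.
```

Integrity and origin (among key holders) are covered; non-repudiation is not, which is why it requires a digital signature made with a private key only the signer holds.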

Q11

Remote user authentication is generally based on challenge-response protocols mainly to counter:

A) Only physical theft of servers
B) Eavesdropping, password capture, and replay
C) Only malware infection
D) Only insider attacks

B) Eavesdropping, password capture, and replay — That is the exact threat set highlighted by the lecture for remote authentication over a network.
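
A minimal sketch of the idea, assuming an HMAC-based response and a hypothetical in-memory secret store (a real server would not keep plaintext secrets like this): the secret never crosses the network, and a captured response is useless against a fresh challenge.

```python
import hashlib
import hmac
import secrets

USER_SECRETS = {"alice": b"correct horse"}   # hypothetical server-side store


def new_challenge() -> bytes:
    """Server: issue a fresh random nonce for each login attempt."""
    return secrets.token_bytes(16)


def respond(secret: bytes, challenge: bytes) -> bytes:
    """Client: prove knowledge of the secret without transmitting it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(user: str, challenge: bytes, response: bytes) -> bool:
    """Server: recompute the expected response for this exact challenge."""
    secret = USER_SECRETS.get(user)
    if secret is None:
        return False
    return hmac.compare_digest(respond(secret, challenge), response)


first_challenge = new_challenge()
captured_response = respond(b"correct horse", first_challenge)
login_ok = verify("alice", first_challenge, captured_response)

# An eavesdropper replays the captured response against a new challenge.
second_challenge = new_challenge()
replay_ok = verify("alice", second_challenge, captured_response)
```

Here `login_ok` is true while `replay_ok` is false, because the response is bound to the nonce that was issued for that one attempt.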

Q12

Typing rhythm is an example of which authentication factor?

A) Something you know
B) Something you have
C) Something you are
D) Something you do

D) Something you do — Typing rhythm is a dynamic biometric, grouped under “something you do.”

Q13

Which access control model compares security labels with security clearances?

A) DAC
B) MAC
C) RBAC
D) ABAC

B) MAC — Mandatory Access Control is based on system-enforced security labels and clearances, not owner discretion.
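
The label-versus-clearance comparison can be sketched in a few lines. The level names and the specific rule used here (a subject may read only objects at or below its clearance, the "no read up" rule commonly associated with the Bell-LaPadula model) are assumptions for illustration; the lecture only states that labels are compared with clearances.

```python
# Hypothetical ordered sensitivity levels (assumption, not from the lecture).
LEVELS = {"public": 0, "confidential": 1, "secret": 2, "top-secret": 3}


def can_read(subject_clearance: str, object_label: str) -> bool:
    """System-enforced check: compare the clearance with the object's label."""
    return LEVELS[subject_clearance] >= LEVELS[object_label]


analyst_reads_secret = can_read("confidential", "secret")
officer_reads_confidential = can_read("secret", "confidential")
```

Note that nothing in the check consults the object's owner, which is the contrast with DAC.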

Q14

Which protocol adds security enhancements to the MIME e-mail format so messages can be signed and/or encrypted?

A) HTTPS
B) S/MIME
C) IPSec
D) SSH

B) S/MIME — Secure/Multipurpose Internet Mail Extension extends MIME with signing and encryption support.

Q15

Which statement best describes IPSec as presented in the lecture?

A) It secures e-mail attachments at the application layer only
B) It operates at the network layer and provides authentication, confidentiality, and key management
C) It is mainly used to replace RBAC in databases
D) It is the protocol used for challenge-response passwords

B) It operates at the network layer and provides authentication, confidentiality, and key management — The lecture also lists secure branch connectivity, remote access, intranets/extranets, and e-commerce enhancement as common applications.

Part B — Essay Prep for Cloud + IoT (20 Marks)

E01

Write a 20-mark answer comparing SaaS, PaaS, and IaaS from a security-responsibility perspective.

Start with the shared-responsibility idea. Then explain that in SaaS the provider manages nearly everything and the customer mainly protects usage and data; in PaaS the customer manages applications and data while the provider manages runtime, middleware, OS, virtualization, servers, storage, and networking; in IaaS the customer manages the OS upward and therefore carries the greatest security burden. Close by stating the tradeoff: more control means more responsibility.
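
One way to memorize the split is to write it down as a small lookup table. The sketch below encodes only the management boundaries described in the blueprint above; the layer names and the helper `responsibility` are hypothetical, and the customer's duty to protect its own usage and data under SaaS is noted in a comment rather than modeled.

```python
# Stack layers from top to bottom, as listed in the blueprint above.
STACK = ["applications", "data", "runtime", "middleware", "os",
         "virtualization", "servers", "storage", "networking"]

# Layers the CUSTOMER manages under each service model.
CUSTOMER_MANAGED = {
    "SaaS": set(),  # provider manages the stack; customer still protects usage and data
    "PaaS": {"applications", "data"},
    "IaaS": {"applications", "data", "runtime", "middleware", "os"},
}


def responsibility(model: str) -> dict:
    """Map every layer to 'customer' or 'provider' for the given model."""
    mine = CUSTOMER_MANAGED[model]
    return {layer: ("customer" if layer in mine else "provider") for layer in STACK}


burden = {model: len(layers) for model, layers in CUSTOMER_MANAGED.items()}
iaas_view = responsibility("IaaS")
```

The resulting `burden` ordering (IaaS above PaaS above SaaS) is the tradeoff sentence in numeric form: more control means more responsibility.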

E02

Explain the 5 actors in the NIST Cloud Computing Reference Architecture and why they matter in security discussions.

Define each actor briefly: Cloud Consumer uses services, Cloud Provider offers and operates them, Cloud Auditor performs independent assessment, Cloud Broker manages or aggregates delivery and use, and Cloud Carrier provides connectivity and transport. Then explain the security value of the model: it clarifies accountability boundaries, where controls are applied, and who is responsible for audits, transport, and service composition.

E03

Discuss cloud security risks and countermeasures using the lecture structure.

Open with the core challenge: the enterprise loses control but must retain accountability. Then name the top threats from the lecture: abuse and nefarious use, insecure interfaces/APIs, malicious insiders, shared technology issues, data loss/leakage, account hijacking, and unknown risk profile. For a strong answer, expand at least three of them with their countermeasures such as stronger registration, strong authentication with encrypted transmission, supplier and contract controls, patching and vulnerability scanning, key management, two-factor authentication, logging, and monitoring.

E04

Explain the IoT reference models and the role of fog computing.

Define the ITU-T Y.2060 4-layer model: device layer, network layer, service support & application support, and application layer, with management and security as cross-cutting capabilities. Then explain the IoT World Forum 7-layer model and emphasize that layers 1-3 are edge-side while layers 4-7 are center-side. Finish with fog computing as the distributed-intelligence transition between edge devices and the cloud, especially important for real-time analysis and transformation close to the source.

E05

Why are constrained IoT devices difficult to secure, and why is the gateway so important?

Constrained devices have severe limits in memory, processing power, timing, and energy, so they often cannot run heavyweight cryptographic or TLS-based protections. The gateway therefore becomes the security boundary: it performs protocol translation, device authentication, data filtering and validation, local policy enforcement, secure uplink communication, and update management. A strong answer should also mention that aggregate providers should authenticate individual IoT device connections before accepting data.

E06

Summarize the IoT security requirements from ITU-T Y.2066 and connect them to a real deployment.

List the requirements directly: communication security, data management security, service provision security, integration of security policies and techniques, mutual authentication and authorization, and security audit. Then explain their practical meaning in a deployment: secure links between device-gateway-cloud, protected data across its lifecycle, secure services to users, harmonized security across heterogeneous components, verified identities before interaction, and logging/auditing of security-relevant events.

HIAST — Master in Big Data Systems — Security & Privacy

Study Reference • Based on all 4 lectures + NIST SP 1500-4r2 • Good luck, Mo 🚀
