Engineering Labs
Multi-Core Architectures and Programming
Practical project ideas and implementations for hands-on learning
Chapter 1: Hardware, Processes, and Threads
Ensuring the Correct Order of Memory Operations
Examining the Insides of a Computer
Hardware, Processes, and Threads
How Latency and Bandwidth Impact Performance
Increasing Instruction Issue Rate with Pipelined Processor Cores
Supporting Multiple Threads on a Single Chip
The Characteristics of Multiprocessor Systems
The Differences Between Processes and Threads
The Motivation for Multicore Processors
The Performance of 32-Bit versus 64-Bit Code
The Translation of Source Code to Assembly Language
Translating from Virtual Addresses to Physical Addresses
Using Caches to Hold Recently Used Data
Using Virtual Memory to Store Data
Chapter 2: Coding for Performance
Coding for Performance
Commonly Available Profiling Tools
Defining Performance
How Cross-File Optimization Improves Performance
How Not to Optimize
How Structure Impacts Performance
Identifying Where Time Is Spent Using Profiling
Performance and Convenience Trade-Offs in Source Code and Build Structures
Performance by Design
Pointer Aliasing and Compiler Optimization Issues
Selecting Appropriate Compiler Options
The Impact of Data Structures on Performance
The Role of the Compiler
The Two Types of Compiler Optimization
Understanding Algorithmic Complexity
Using Algorithmic Complexity with Care
Using Libraries to Structure Applications
Using Profile Feedback
Why Algorithmic Complexity Is Important
Chapter 3: Identifying Opportunities for Parallelism
Amdahl’s Law
Anti-dependencies and Output Dependencies
Client–Server Division of Work
Combining Parallelization Strategies
Critical Paths
Data Parallelism Using SIMD Instructions
Determining Maximum Practical Threads
Hosting Multiple Operating Systems Using Hypervisors
How Dependencies Affect Parallel Execution
How Parallelism Changes Algorithm Choice
How Synchronization Costs Reduce Scaling
Identifying Opportunities for Parallelism
Identifying Parallelization Opportunities
Improving Machine Efficiency Through Consolidation
Multiple Copies of the Same Task
Multiple Independent Tasks
Multiple Loosely Coupled Tasks
Multiple Users Utilizing a Single System
Parallelization Patterns
Parallelization Using Processes or Threads
Pipeline of Tasks for One Item
Producer–Consumer Splitting
Single Task Split Over Multiple Threads
Using Containers to Isolate Applications
Using Multiple Processes to Improve System Productivity
Using Parallelism to Improve Performance of a Single Task
Using Speculation to Break Dependencies
Visualizing Parallel Applications
Chapter 4: Synchronization and Data Sharing
Atomic Operations and Lock-Free Code
Avoiding Data Races
Barriers
Communication Between Threads and Processes
Data Races
Deadlocks and Livelocks
Mutexes and Critical Regions
Readers–Writer Locks
Semaphores
Spin Locks
Storing Thread-Private Data
Synchronization and Data Sharing
Synchronization Primitives
Tools for Detecting Data Races
Chapter 5: Using POSIX Threads
Compiling Multithreaded Code
Creating Threads
Multiprocess Programming
Process Termination
Reentrant Code and Compiler Flags
Sharing Data Between Threads
Sockets
Using POSIX Threads
Variables and Memory
Chapter 6: Windows Threading
Allocating Thread-Local Storage
Atomic Updates of Variables
Communicating Using Sockets
Communicating with Pipes
Creating and Resuming Suspended Threads
Creating Native Windows Threads
Creating Processes
Example: Requiring Synchronization Between Threads
Inheriting Handles in Child Processes
Naming Mutexes and Sharing Them
Protecting Code with Critical Sections
Protecting Regions with Mutexes
Setting Thread Priority
Sharing Memory Between Processes
Signaling Event Completion
Slim Reader/Writer Locks
Synchronization and Resource Sharing Methods
Terminating Threads
Using Handles to Kernel Resources
Wide String Handling
Windows Threading
Chapter 7: Using Automatic Parallelization and OpenMP
Accessing Private Data Outside Parallel Regions
Assisting the Compiler with Automatic Parallelization
Collapsing Loops to Improve Balance
Controlling the OpenMP Runtime Environment
Dynamic Parallel Tasks in OpenMP
Enforcing Memory Consistency
Ensuring In-Order Execution in Parallel Regions
Identifying and Parallelizing Reductions
Improving Scheduling and Work Distribution
Keeping Data Private to Threads
Nested Parallelism
Parallel Sections for Independent Work
Parallelization Example
Parallelizing Code Containing Calls
Parallelizing Reductions Using OpenMP
Producing a Parallel Application Automatically
Restricting Threads That Execute a Region
Runtime Behavior of OpenMP Applications
Using Automatic Parallelization and OpenMP
Using OpenMP to Create Parallel Applications
Using OpenMP to Parallelize Loops
Variable Scoping in OpenMP Regions
Waiting for Work Completion
Chapter 8: Hand-Coded Synchronization and Sharing
Atomic Operations
Compare-and-Swap for Complex Atomics
Compiler Memory-Ordering Directives
Dekker’s Algorithm
Enforcing Memory Ordering
Hand-Coded Synchronization and Sharing
Lockless Algorithms
Modifying Code to Use Atomics
OS-Provided Atomics
Producer–Consumer with Circular Buffer
Reordering by the Compiler
Scaling Consumers or Producers
Scaling Producer–Consumer to Multiple Threads
The ABA Problem
Volatile Variables
Chapter 9: Scaling with Multicore Processors
Bandwidth Sharing Between Cores
Cache Conflict and Capacity Issues
Constraints to Application Scaling
False Sharing
Hardware Constraints to Scaling
Multicore Processors and Scaling
OS Constraints to Scaling
Pipeline Resource Starvation
Scaling with Multicore Processors
Chapter 10: Other Parallelization Technologies
Alternative Languages
Clustering Technologies
GPU-Based Computing
Language Extensions
Transactional Memory
Vectorization