Memory Protocol of the First Exam, PHC

Winter semester 2021/22
Anonymous
05.04.2022


Contents

  1. Question 1 (7 points)
  2. Question 2 (7 points)
  3. Weak and Strong Scaling (3 points)
  4. Hypothetical System with Sequential Consistency (5 points)
  5. False Sharing (5 points)
  6. OpenMP (10 points)
  7. Shared or Private in OpenMP (6 points)
  8. MPI Tasks (6 points)
  9. Questions to MPI (13 points)
  10. Problem 10 (8 points)
  11. Problem 11 (6 points)
  12. Problem 12 (4 points)
  13. Problem 13 (6 points)

Question 1 (7 points)

a) What does MPI stand for? (1 point)

b) What does VLIW stand for? (1 point)

c) What is the arithmetic intensity of the application? (2 points)

d) Name 2 examples of ILP (Instruction-Level Parallelism) (2 points)

e) What is the typical size of a cache line? (1 point)


Question 2 (7 points)

a) Name 1 MPI collective function that uses an operation (1 point)

b) What are sparse matrices and where do they occur? (2 points)

c) Speedup, Efficiency, Amdahl's Law

A program consists of a sequential part that accounts for 40% and a parallel part that accounts for 60% of the execution time (measured on 1 processor).

  • i) What is the speedup with 6 processors? (1 point)
  • ii) What is the efficiency with 6 processors? (1 point)
  • iii) What is the maximum speedup if the number of processors were ∞? (1 point)
  • iv) Which law describes the upper bound on speedup? (1 point)
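A worked sketch for part c), assuming the usual form of Amdahl's law. The symbols s (sequential fraction), p (parallel fraction), n (processor count), S (speedup), and E (efficiency) are introduced here for illustration and do not appear on the exam sheet.

```latex
\begin{align*}
S(n) &= \frac{1}{s + \frac{p}{n}}, \qquad s = 0.4,\; p = 0.6 \\
S(6) &= \frac{1}{0.4 + \frac{0.6}{6}} = \frac{1}{0.5} = 2 \\
E(6) &= \frac{S(6)}{6} = \frac{1}{3} \approx 33\,\% \\
\lim_{n \to \infty} S(n) &= \frac{1}{s} = 2.5
\end{align*}
```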

Weak and Strong Scaling (3 points)

a) Which kind of scaling is shown in Figure 1? (1 point)

b) Which kind of scaling is shown in Figure 2? (1 point)

c) In Figure 2, the lines deviate from the ideal line beyond a certain point. What could be the reason? (1 point)

Figure 1 (Listing 1): execution time (y-axis) over number of cores (x-axis); three roughly horizontal curves.

Figure 2 (Listing 2): execution time (y-axis) over number of cores (x-axis); falling curves a, b, and c, plus the ideal line for b.

Hypothetical System with Sequential Consistency (5 points)

// Pseudo code
int flag = 0;
int val1 = 0;
int val2 = 0;
 
/* ThreadA */
{
    val1 = 42;
    val2 = 11;
    flag = 1;
    val2 = 22;
}
 
/* ThreadB */
{
    val2 = 33;
    if (flag == 1) {
        print(val1);
        print(val2);
    }
}
 
/* ThreadC */
{
    if (flag == 1) {
        val2 = 44;
    }
}

Given this pseudocode: which possible (plural!) print outputs can ThreadB produce? ("no output" is also admissible if nothing is printed.)

nr    val1    val2
0
1
2
3
4
7

False Sharing (5 points)

a) What is false sharing? (3 points)

b) Name 2 software solutions for false sharing (2 points)
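One software-side remedy that is often named for b) is padding per-thread data so that each item occupies its own cache line. The following is a minimal sketch, assuming 64-byte cache lines; `CACHE_LINE`, `SLOTS`, `padded_counter`, and `sum_with_padded_slots` are hypothetical names chosen for illustration.

```c
#define CACHE_LINE 64   /* assumption: 64-byte cache lines */
#define SLOTS 8         /* hypothetical number of independent counters */

/* Each counter is padded to a full cache line, so the value fields of
   neighbouring slots lie 64 bytes apart and never share a line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Every loop iteration s writes only to slots[s]; with OpenMP enabled the
   iterations run on different threads without invalidating each other's
   cache lines. Without OpenMP the pragma is ignored and the code runs
   sequentially with the same result. */
long sum_with_padded_slots(long per_slot) {
    struct padded_counter slots[SLOTS] = {{0}};

    #pragma omp parallel for
    for (int s = 0; s < SLOTS; s++) {
        for (long k = 0; k < per_slot; k++) {
            slots[s].value++;
        }
    }

    long total = 0;
    for (int s = 0; s < SLOTS; s++) {
        total += slots[s].value;
    }
    return total;
}
```

A second commonly mentioned option is to accumulate into a thread-local variable and write the result back once at the end, which avoids repeated writes to neighbouring memory altogether.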


OpenMP (10 points)

a) Name 2 OpenMP worksharing constructs (2 points)

b) Name 2 approaches for mutual exclusion in OpenMP and give 2 examples (4 points)

c) OpenMP clauses (4 points)

  • i) What are OpenMP clauses for the for qualifier?
  • ii) Name 2 possible clauses
  • iii) Explain the clauses
  • iv) Give examples of when one clause is preferable to the other and vice versa

Shared or Private in OpenMP (6 points)

Example code:

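#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#define N 100

// Helper prototypes are assumed; the exam did not show their definitions.
void init(int *arr);              // initializes the array elements in place
void some_call(int value);
void some_other_call(int value);
void yet_another_call(int *arr);

int main (int argc, char **argv) {
    int a = 10;
    int B[N], C[N], D[N];
    // init: initialize array elements
    init(B);
    init(C);
    init(D);

    #pragma omp parallel private(a)
    {
        #pragma omp for private(B)
        for (int i = 0; i < N; i++) {
            // Visibility at this place
            some_call(a);
            some_other_call(i);
            yet_another_call(B);
        }
    }
    return 0;
}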

a) Shared or Private: i (1 point)

b) Shared or Private: a (1 point)

c) Shared or Private: C (1 point)

d) Shared or Private: B (1 point)

e) Inside the parallel region, a is not initialized. How can the code be changed so that a has the value 10? (2 points)
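One common way to approach e) is the `firstprivate` clause, which gives every thread its own copy that is initialized from the value the variable had before the region. A minimal sketch, in which `sum_private_copies` and `total` are hypothetical names and not part of the exam code:

```c
/* Each thread gets a private copy of a that starts at 10 (firstprivate),
   unlike private(a), which leaves the copy uninitialized.
   Every iteration adds its thread's copy of a to the reduction variable. */
int sum_private_copies(int n) {
    int a = 10;
    int total = 0;

    #pragma omp parallel firstprivate(a) reduction(+:total)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            total += a;
        }
    }
    return total;
}
```

If every private copy really starts at 10, calling `sum_private_copies(100)` yields 1000 regardless of the number of threads.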


MPI Tasks (6 points)

A dependency graph of an application is given:

        f()
       /  \
   g0()   g1()   g2()
       \  /
         h()

Create a correct OpenMP program with Tasks (and no other synchronization tools) which executes the given functions f(), g0(), g1(), g2(), h() in the specified order while respecting the data dependencies shown in the graph.
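A possible sketch using only tasks with `depend` clauses. It rests on a reading of the graph that is an assumption: g0() and g1() depend on f(), h() depends on g0() and g1(), and g2() is treated as independent because the drawing shows no edge for it. The function bodies and the variables `x`, `a`, `b`, `c`, `r` are hypothetical placeholders; the exam only names the functions.

```c
/* Hypothetical bodies so that the data flow is visible. */
static int f(void)         { return 1; }
static int g0(int x)       { return x + 10; }
static int g1(int x)       { return x + 20; }
static int g2(void)        { return 5; }
static int h(int a, int b) { return a + b; }

int run_graph(void) {
    int x = 0, a = 0, b = 0, c = 0, r = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* f produces x */
        #pragma omp task depend(out: x) shared(x)
        x = f();

        /* g0 and g1 both wait for x and may run concurrently */
        #pragma omp task depend(in: x) depend(out: a) shared(x, a)
        a = g0(x);

        #pragma omp task depend(in: x) depend(out: b) shared(x, b)
        b = g1(x);

        /* g2 has no dependencies under the assumed reading of the graph */
        #pragma omp task shared(c)
        c = g2();

        /* h waits for the results of g0 and g1 */
        #pragma omp task depend(in: a, b) depend(out: r) shared(a, b, r)
        r = h(a, b);
    }   /* all tasks have completed at the implicit barrier here */

    return r + c;
}
```

If the intended graph also connects g2() to f() and h(), the g2 task would need `depend(in: x) depend(out: c)` and the h task would additionally list `c` in its `depend(in: ...)` clause.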


Questions to MPI (13 points)

a) Give 1 example each of a two-sided, a one-sided, and a collective MPI function (point)

b) Name the 4 MPI communication modes for two-sided point-to-point communication (1 point)

c) What are MPI communicators? (1 point)

d) What are MPI tags? (1 point)

e) What are MPI window objects? (1 point)

f) What is used in MPI two-sided point-to-point communication? (1 point)

Options: Communicators, Tags, Window objects, None, All

g) What is used in MPI one-sided communication? (1 point)

Options: Communicators, Tags, Window objects, None, All

h) What is used in MPI collective communication? (1 point)

Options: Communicators, Tags, Window objects, None, All


Problem 10 (8 points)

a) Which statement is true with respect to deadlocks for scenario 1? (1 point)

Options: No deadlock, deadlock guaranteed, or deadlock possible

b) Which statement is true with respect to deadlocks for scenario 2? (1 point)

Options: No deadlock, deadlock guaranteed, or deadlock possible

c) Which statement is true with respect to deadlocks for scenario 3? (1 point)

Options: No deadlock, deadlock guaranteed, or deadlock possible

d) Which statement is true with respect to deadlocks for scenario 4? (1 point)

Options: No deadlock, deadlock guaranteed, or deadlock possible

e) Is scenario 5 a valid MPI program? List changes to make it valid. (1 point)

f) Is scenario 6 a valid MPI program? List changes to make it valid. (1 point)

g) Is scenario 7 a valid MPI program? List changes to make it valid. (1 point)

h) Is scenario 8 a valid MPI program? List changes to make it valid. (1 point)


Problem 11 (6 points)

a) Describe the hierarchical hardware of GPUs. (2 points)

b) How is hierarchical hardware organization reflected in CUDA? (2 points)

c) How is hierarchical hardware organization reflected in OpenMP? (2 points)


Problem 12 (4 points)

a) Which can run a single MPI application with all resources? (1 point)

b) Which can run a single OpenMP application with all resources? (1 point)

c) Which can run a single PGAS application with all resources? (1 point)

d) In which can NUMA influence the performance? (1 point)


Problem 13 (6 points)

a) Part 1

  • i) What is the diameter of one PC? (1 point)
  • ii) What is the shortest route from (0,0,1,1,0,1,0) to (0,0,1,1,1,0,1)? (1 point)
  • iii) With an optimal broadcast algorithm, what are the minimum steps required to send data from (0, …, 0) to all other nodes? (1 point)
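The questions in part a) can be explored with a small sketch. It assumes that one PC is a 7-dimensional hypercube (inferred from the 7-bit node labels and the 8-dimensional combination asked about in b) and that a label is stored as a bitmask with the leftmost coordinate as the most significant bit. The function names are hypothetical.

```c
/* Number of hops on a shortest path between two hypercube nodes:
   the number of label bits in which they differ (Hamming distance). */
int hypercube_distance(unsigned a, unsigned b) {
    unsigned diff = a ^ b;
    int hops = 0;
    while (diff) {
        hops += (int)(diff & 1u);
        diff >>= 1;
    }
    return hops;
}

/* One shortest route: flip the differing bits from lowest to highest.
   path[] receives the node reached after each hop; returns the hop count. */
int hypercube_route(unsigned src, unsigned dst, int dim, unsigned path[]) {
    int hops = 0;
    unsigned cur = src;
    for (int bit = 0; bit < dim; bit++) {
        unsigned mask = 1u << bit;
        if ((cur ^ dst) & mask) {
            cur ^= mask;          /* move along this dimension */
            path[hops++] = cur;
        }
    }
    return hops;
}
```

Under these assumptions, (0,0,1,1,0,1,0) is 0x1A and (0,0,1,1,1,0,1) is 0x1D; they differ in three bits, so a shortest route has three hops, and any order of flipping those three bits gives a valid shortest route. The largest possible distance in a d-dimensional hypercube is d, since at most d bits can differ.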

b) When combining both PCs to form an 8-dimensional hypercube:

  • i) How many additional nodes does one need? (1 point)
  • ii) How many additional links does one need? (1 point)
  • iii) How can the existing Gray code scheme be extended? (1 point)

Exam 1
Date: 21.02.2022
Duration: 90 minutes
Total points: 86
