
0

FhGFS – Performance at the maximum

Christian Mohrbacher christian.mohrbacher@itwm.fraunhofer.de

1

Introduction

Overview of FhGFS

Benchmarks

2

The Fraunhofer Gesellschaft (FhG)

Fraunhofer is based in Germany

Largest organization for applied research in Europe

Annual research volume of 1.6 billion euros

17,000 employees

~ 60 Fraunhofer institutes with different business fields

[Map: Fraunhofer institute locations across Germany]

3

The Fraunhofer ITWM

Institute for Industrial Mathematics

Located in Kaiserslautern, Germany

Staff: ~ 150 employees + ~ 70 PhD students

4

ITWM’s Competence Center HPC

FhGFS

Photorealistic RT rendering

Interactive seismic imaging

Green IT

Smart Grids

Programming models / tools

Research

5

Introduction

Overview of FhGFS

Benchmarks

6

FhGFS - Overview

Maximum Scalability

Flexibility

Easy to use

Free to use

Support by Fraunhofer

http://www.fhgfs.com

7

FhGFS – Key concepts (1)

Maximum Scalability

Distributed file contents & metadata

Optimized specifically for HPC from the start

Native InfiniBand / RDMA

8

FhGFS - Key concepts (2)

Flexibility

Add clients and servers without downtime

Multiple servers on the same machine

Client and servers can run on the same machine

Servers run on top of local FS

On-the-fly storage init => suitable for temporary “per-job” PFS

Flexible striping (per-file/per-directory); a short striping sketch follows this list

Multiple networks with dynamic failover
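To make the striping idea concrete, here is a minimal, illustrative sketch of how a round-robin stripe pattern maps a byte offset to a storage target and chunk. This is not FhGFS's actual internal layout code; the 512 KiB chunk size and the four target names are invented example values.

```python
# Illustrative only: maps a file offset to (storage target, chunk index,
# offset within chunk) for a simple round-robin stripe pattern.
# Chunk size and target list are invented example values.

CHUNK_SIZE = 512 * 1024                                            # example: 512 KiB chunks
TARGETS = ["storage01", "storage02", "storage03", "storage04"]     # example targets

def locate(offset):
    """Return (target, chunk_index, offset_in_chunk) for a byte offset."""
    chunk_index = offset // CHUNK_SIZE
    target = TARGETS[chunk_index % len(TARGETS)]
    return target, chunk_index, offset % CHUNK_SIZE

if __name__ == "__main__":
    for off in (0, CHUNK_SIZE - 1, CHUNK_SIZE, 5 * CHUNK_SIZE + 123):
        print(off, locate(off))
```

Per-directory striping then simply means that new files pick up the chunk size and number of targets configured on their parent directory.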

9

FhGFS - Key concepts (3)

Easy to use

Servers: userspace

Client: Kernel module w/o kernel patches

Graphical system administration & monitoring

Simple setup/startup mechanism

No specific Linux distribution required

No special hardware requirements

10

Partners / Vendors

11

Customers (Examples)

2 servers, 2 clients: 8 TB, 800 MB/s

12 servers, 900 clients: 1 PB, 20 GB/s

12 servers, 1200 clients: 300 TB, 6 GB/s

5 servers, 100 clients: 200 TB, 5 GB/s

12

Current development

Integrated High Availability

No shared storage needed

Flexible mirroring

RAID10 available in 2012.10-beta1

Internal speed improvements

e.g. metadata format (available in 2012.10-beta1)

HSM integration

Grau Data and Fraunhofer collaborate

Providing a fast archiving solution

Built-in benchmarking tools (available in 2012.10-beta1)

Quotas

13

Introduction

Overview of FhGFS

Benchmarks

14

File Statistics

The DICE PFS comparison project surveyed HPC data center representatives to identify the most important metrics 1)

Multi-stream performance

Large block I/O

Metadata performance

File size statistics by Johannes Gutenberg University Mainz 2)

Large files are common (>100 GB)

Very small files (<=4k) are the most common

90% of the files account for only ~10% of the disk capacity (a sketch of gathering such statistics follows this list)
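As a hedged illustration of how file-size statistics of this kind can be gathered (this is not the tool used in the cited study), the following sketch walks a directory tree and reports which fraction of the files, and of the total capacity, falls below a small-file threshold. The scan root and the 4k threshold are example values.

```python
# Illustrative sketch: walk a directory tree and report what fraction of the
# files, and of the total capacity, is made up of "very small" files.
import os

ROOT = "/mnt/fhgfs"          # example file system to scan
SMALL = 4 * 1024             # "very small" threshold: 4k

def size_stats(root, threshold):
    n_files = n_small = 0
    total_bytes = small_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue                 # file vanished or unreadable; skip it
            n_files += 1
            total_bytes += size
            if size <= threshold:
                n_small += 1
                small_bytes += size
    return n_files, n_small, total_bytes, small_bytes

if __name__ == "__main__":
    files, small, cap, small_cap = size_stats(ROOT, SMALL)
    if files:
        print("%.1f%% of files are <= 4k, holding %.1f%% of the capacity"
              % (100.0 * small / files, 100.0 * small_cap / max(cap, 1)))
```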

1) PFS Survey Report; http://www.avetec.org/appliedcomputing/dice/projects/pfs/docs/PFS_Survey_Report_Mar2011.pdf

2) A Study on Data Deduplication in HPC Storage Systems; Dirk Meister et al.; Johannes Gutenberg Universität; SC12

15

Benchmarks – server hardware

20 servers for metadata and storage

2x Intel Xeon X5660 @ 2.8 GHz

48 GB RAM

4x Intel 510 Series SSD (RAID 0), Ext4

QDR InfiniBand

Scientific Linux 6.3; Kernel 2.6.32-279

FhGFS 2012.10-beta1

16

Streaming Throughput

Sequential Read/Write, up to 20 servers, 160 client procs

[Chart: aggregate throughput (MB/s) vs. number of storage servers; separate write and read curves]
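The throughput figures above come from a multi-stream sequential I/O workload: many client processes, each streaming its own large file. As a rough illustration of what a single client process in such a test does (not the benchmark tool actually used for these measurements), here is a minimal sketch that writes one large file sequentially and reports MB/s; the path, block size, and file size are invented example values.

```python
# Minimal single-process streaming-write sketch (illustrative only).
# Writes a file sequentially in fixed-size blocks and reports MB/s.
import os, time

PATH = "/mnt/fhgfs/streamtest.bin"   # example mount point / file name
BLOCK_SIZE = 1024 * 1024             # 1 MiB blocks (example)
TOTAL_BYTES = 4 * 1024**3            # 4 GiB per process (example)

def stream_write(path, block_size, total_bytes):
    block = b"\0" * block_size
    written = 0
    start = time.time()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            written += os.write(fd, block)
        os.fsync(fd)                 # make sure the data really reached the servers
    finally:
        os.close(fd)
    elapsed = time.time() - start
    return written / 1e6 / elapsed   # MB/s

if __name__ == "__main__":
    print("%.1f MB/s" % stream_write(PATH, BLOCK_SIZE, TOTAL_BYTES))
```

The real test runs many such processes in parallel across the client nodes and sums their throughput.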

17

Streaming Throughput (2)

Single node local performance

Write: 1332 MB/s

Read: 1317 MB/s

20 nodes (theoretical: 20 × single node)

Write: 26640 MB/s

Read: 26340 MB/s

FhGFS

Write: 26247 MB/s (98.5% of theoretical)

Read: 24789 MB/s (94.1% of theoretical)

[Chart: measured aggregate throughput (MB/s) vs. number of storage servers; write and read curves]

18

Streaming Throughput (3)

Sequential Read/Write, 20 servers, up to 768 client procs

[Chart: aggregate throughput (MB/s) vs. number of client processes (6 to 768); write and read curves with peak labels of 25409 and 26649 MB/s]

19

Shared file access (1)

Sequential I/O, 1 shared file, 600k block size, up to 20 servers, 192 client procs

[Chart: shared-file throughput (MB/s) vs. number of servers; write and read curves]
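In the shared-file test, all client processes write disjoint regions of one common file. The following minimal sketch (again illustrative, not the actual benchmark) shows one way to do this: each process gets a rank and writes every nprocs-th block at rank-dependent offsets via positional writes, so the processes never overlap. The file path, block counts, and the RANK environment variable are assumptions for the example.

```python
# Illustrative strided writes into one shared file: process `rank` of `nprocs`
# writes every nprocs-th block, so all processes together cover the file
# without overlapping. Paths and sizes are example values.
import os

PATH = "/mnt/fhgfs/sharedfile.bin"   # example shared file
BLOCK_SIZE = 600 * 1024              # 600k blocks, as in the benchmark caption
BLOCKS_PER_PROC = 1024               # example amount of data per process

def shared_write(rank, nprocs):
    block = b"\0" * BLOCK_SIZE
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)   # no truncate: file is shared
    try:
        for i in range(BLOCKS_PER_PROC):
            global_block = i * nprocs + rank               # block index owned by this rank
            os.pwrite(fd, block, global_block * BLOCK_SIZE)
        os.fsync(fd)
    finally:
        os.close(fd)

if __name__ == "__main__":
    # example: started as one of 4 processes, e.g. by a job scheduler that sets RANK
    shared_write(rank=int(os.environ.get("RANK", "0")), nprocs=4)
```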

20

Shared file access (2)

Sequential write, 1 shared file, 20 servers, up to 768 client procs

[Chart: shared-file write throughput (MB/s) vs. number of client processes (12 to 768)]

21

IOPS

IOPS (random 4k writes), up to 20 servers, 160 client procs

[Chart: IOPS vs. number of storage servers (2 to 20), with data labels of 109992 and 1126963 IOPS]
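Small random writes stress the storage targets quite differently from streaming I/O. As an illustration of what one client process in a random 4k write test might do (not the tool used for these numbers), the sketch below issues 4 KiB writes at random aligned offsets within a preallocated file and reports the resulting IOPS; the path and sizes are example values.

```python
# Illustrative random 4k write loop: small writes at random aligned offsets
# within a preallocated file, reporting IOPS for this one process.
import os, random, time

PATH = "/mnt/fhgfs/iopstest.bin"     # example file on the parallel FS
IO_SIZE = 4096                       # 4k writes
FILE_SIZE = 1024**3                  # 1 GiB working set (example)
NUM_IOS = 100_000                    # number of writes to issue (example)

def random_write_iops():
    buf = b"\0" * IO_SIZE
    max_block = FILE_SIZE // IO_SIZE
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, FILE_SIZE)  # preallocate the working set
        start = time.time()
        for _ in range(NUM_IOS):
            os.pwrite(fd, buf, random.randrange(max_block) * IO_SIZE)
        os.fsync(fd)
        elapsed = time.time() - start
    finally:
        os.close(fd)
    return NUM_IOS / elapsed

if __name__ == "__main__":
    print("%.0f IOPS" % random_write_iops())
```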

22

Metadata performance

File create / stat, up to 20 metadata servers (MDS), up to 640 client procs (32 × #MDS)

[Chart: file creates per second vs. number of MDS (1 to 20), with data labels of 34693 and 539724 creates/sec]

[Chart: stat operations per second vs. number of MDS (1 to 20), with data labels of 93007 and 1381339 stats/sec]
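Metadata benchmarks of this kind (similar in spirit to mdtest) have each client process create a large number of empty files in its own directory and then stat them, reporting operations per second. The sketch below is an illustrative single-process version, not the tool behind the slide's numbers; the directory path and file count are example values.

```python
# Illustrative metadata benchmark for one client process: create many empty
# files in a private directory, then stat them, and report the rates.
import os, time

BASE_DIR = "/mnt/fhgfs/mdtest/proc0"   # example per-process working directory
NUM_FILES = 100_000                    # example file count per process

def metadata_rates():
    os.makedirs(BASE_DIR, exist_ok=True)
    names = [os.path.join(BASE_DIR, "f%06d" % i) for i in range(NUM_FILES)]

    start = time.time()
    for name in names:
        open(name, "w").close()        # create an empty file
    create_rate = NUM_FILES / (time.time() - start)

    start = time.time()
    for name in names:
        os.stat(name)
    stat_rate = NUM_FILES / (time.time() - start)
    return create_rate, stat_rate

if __name__ == "__main__":
    creates, stats = metadata_rates()
    print("%.0f creates/sec, %.0f stats/sec" % (creates, stats))
```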

23

Metadata performance (2)

> 500,000 file creates per second

Creation of 1,000,000,000 files: ~ 33 minutes (10^9 files / ~500,000 creates per second ≈ 2,000 s)

24

Questions?

http://www.fhgfs.com

http://wiki.fhgfs.com

Fraunhofer Booth #643