====== ZFS ======

  * [[https://jrs-s.net/2018/08/17/zfs-tuning-cheat-sheet/|ZFS Cheat Sheet]]
  * https://github.com/jimsalterjrs/ioztat
  * https://www.reddit.com/r/zfs/comments/tmio9p/recordsize_1m_ok_for_generaluse_datasets_for_home/

  * https://blog.uptrace.dev/posts/ubuntu-install-zfs.html
  * https://github.com/ddebeau/zfs_uploader

  * [[https://www.45drives.com/community/articles/zfs-caching/|What is ZIL, SLOG, ARC, L2ARC?]]
    * https://www.servethehome.com/what-is-the-zfs-zil-slog-and-what-makes-a-good-one/

  * [[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html|Tuning Params]]
  * https://launchpad.net/%7Ejonathonf/+archive/ubuntu/zfs
  * [[https://www.servethehome.com/an-introduction-to-zfs-a-place-to-start/|ZFS Primer]]
  * [[https://klarasystems.com/articles/openzfs-using-zpool-iostat-to-monitor-pool-perfomance-and-health/|Using zpool iostat to monitor pool performance and health]]
  * [[https://gist.github.com/papamoose/826a975b19f10926360c032cd982687c|OWC Thunderbay boltd, zfs, docker import and start script]]
  * [[https://j-keck.github.io/zfs-snap-diff/docs/zfs-snap-diff/|zfs-snap-diff]]
  * [[https://github.com/chadmiller/zpool-iostat-viz|Visualize zpool iostat]]
  * [[https://gist.github.com/papamoose/741ef99b54211cb646c656afef850f5d|TODO: Modify this script to send zed notifications to Discord or other webhooks.]]
  * [[https://github.com/mvgijssel/setup/wiki/Benchmarking-ZFS---NFS|Benchmarking ZFS NFS]]

===== My Bug Fixes =====

  * https://github.com/openzfs/zfs/pull/13049

===== PostgreSQL on ZFS =====

  * https://vadosware.io/post/everything-ive-seen-on-optimizing-postgres-on-zfs-on-linux/
    * [[https://web.archive.org/web/20211221132441/https://vadosware.io/post/everything-ive-seen-on-optimizing-postgres-on-zfs-on-linux/|archive.org]]

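The article above goes into much more depth; as a minimal, hedged sketch, the settings most commonly cited as starting points for a PostgreSQL data dataset look roughly like this. The dataset name is hypothetical and the right values depend on the workload, so verify against the article before copying anything.

<code>
# Hypothetical dataset for the PostgreSQL data directory.
# recordsize=8K matches PostgreSQL's 8 KiB page size; lz4 compression and
# atime=off are the usual low-risk defaults.
zfs create -o recordsize=8K -o compression=lz4 -o atime=off tank/postgres
</code>
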
===== Replace a drive =====
It's just two commands. There are many ways to specify the drive.

The general format is: ''%% zpool <command> <pool name> <old drive> [<new drive>] %%''

<code>
zpool offline tank ata-ST3300831A_5NF0552X
</code>
Now physically replace the drive.
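If you need to find the by-id name of the new disk, listing ''/dev/disk/by-id'' and filtering out partition symlinks is usually enough (the grep is just an example):

<code>
ls -l /dev/disk/by-id/ | grep -v part
</code>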
<code>
zpool replace tank ata-ST3300831A_5NF0552X /dev/disk/by-id/<new drive>
</code>

Check the status of the zpool. You will see output indicating that the drive is being replaced.
<code>
zpool status
</code>

1: https://askubuntu.com/questions/305830/replacing-a-dead-disk-in-a-zpool

===== IO Error =====
An FYI on what to do when this happens.

> On 01/29/2018 01:12 PM, root wrote:
> ZFS has detected an io error:

>     eid: 295
>   class: io
>    host: backup1
>    time: 2018-01-29 13:12:11-0600
>   vtype: disk
>   vpath: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NV9YL-part1
>   vguid: 0xBC64F09126CA52D8
>   cksum: 0
>    read: 0
>   write: 0
>    pool: tank

<code>
root@backup1:~# zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 36K in 82h10m with 0 errors on Wed Jan 17 10:34:24 2018
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank                                          ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX11DA40HCD2  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D1404379  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NV9YL  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAAC  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAE9  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A28YZ  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2AZK  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2H5K  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2KTJ  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2NU9  ONLINE       0     0    29

errors: No known data errors
</code>

1. The link it provides (http://zfsonlinux.org/msg/ZFS-8000-9P/) has useful info. You should read it.

2. ''root@backup1:~# smartctl -a /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2NU9 | less''
SMART values 5, 197, 198 (187 and 188 are unavailable) are all 0, so we proceed under the assumption that the disk is fine.

3. ZFS did recover, so there doesn't seem to be anything to do. It was probably a bit flip, so we can safely clear the error.
<code>
root@backup1:~# zpool clear tank ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2NU9
root@backup1:~# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 36K in 82h10m with 0 errors on Wed Jan 17 10:34:24 2018
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank                                          ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX11DA40HCD2  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D1404379  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NV9YL  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAAC  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAE9  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A28YZ  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2AZK  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2H5K  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2KTJ  ONLINE       0     0     0
            ata-WDC_WD60PURX-64LZMY0_WD-WX31D65A2NU9  ONLINE       0     0     0

errors: No known data errors
</code>

4. If errors keep happening on this particular drive, ZFS will mark it as faulted; when/if that happens we just replace the disk.


More info:

Looking at the previous emails, these are the IO errors ZED has reported on backup1.
<code>
2018-01-29: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NV9YL-part1
2018-03-14: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAAC-part1
2018-04-10: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NV9YL-part1
2018-05-10: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAAC-part1
2018-05-17: /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAE9-part1
</code>

Taking this last IO error as an example:
<code>
root@backup1:~# smartctl -A /dev/disk/by-id/ata-WDC_WD60PURX-64LZMY0_WD-WX21D65NVAE9 | grep -E ' 5|^197|^198|^187|^188'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
</code>

SMART attributes differ from drive to drive and manufacturer to manufacturer, so interpreting them is hard. Based on the Backblaze data releases we can use SMART attributes 5, 187, 188, 197, and 198 as a baseline: if any of these raw values is above 0, the likelihood that the drive will fail increases (non-linearly). Because the raw values on this particular drive are all 0, this IO error is probably nothing to worry about (ZFS did recover). A quick way to check these attributes across all of the pool's drives is sketched below.
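
A minimal sketch (the by-id glob assumes the drive names shown above):

<code>
# Print the Backblaze-relevant SMART attributes for every pool member.
for d in /dev/disk/by-id/ata-WDC_WD60PURX-*; do
    case "$d" in *-part*) continue ;; esac   # skip partition symlinks
    echo "== $d"
    smartctl -A "$d" | grep -E '^  5 |^187 |^188 |^197 |^198 '
done
</code>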

We are probably dealing with firmware that isn't optimized for how we use these drives. They are also cheaper drives, so I would expect these IO errors to be normal for these WD Purples.

A scrub is still running, so we should find out more when it's done.
<code>
root@backup1:~# zpool status | head -n5
  pool: tank
 state: ONLINE
  scan: scrub in progress since Sun May 13 00:24:01 2018
    12.9T scanned out of 19.0T at 35.7M/s, 49h46m to go
    0 repaired, 67.97% done
</code>


===== send/recv tools =====

==== Using unprivileged accounts ====

  * [[https://www.reddit.com/r/zfs/comments/u2qsk6/using_unprivileged_accounts_with_syncoid_and/|Using unprivileged accounts with Syncoid (and other ZFS replication tools)]]
<code>
root@sendbox:~# zfs allow senduser send,hold pool/ds
root@recvbox:~# zfs allow recvuser receive,create,mount pool
</code>
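
To double-check the delegation, running ''zfs allow'' with just the dataset prints the current permission set. The syncoid line below is only a hedged example of a pull from the unprivileged receive side (host and dataset names are made up; flags vary by syncoid version):

<code>
# on sendbox: show what has been delegated
zfs allow pool/ds

# on recvbox, as recvuser: pull without privilege elevation
syncoid --no-sync-snap --no-privilege-elevation senduser@sendbox:pool/ds pool/ds
</code>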

==== sanoid/syncoid ====

==== zrepl ====

==== znapzend ====

  * [[https://github.com/asciiphil/check_znapzend|Check znapzend script]]


=== Bugs ===
  * https://www.reddit.com/r/zfs/comments/9jsn9c/notes_on_how_to_disable_the_cache_file_in_zol/
  * https://github.com/openzfs/zfs/issues/8885



==== Monitor ====
  * https://www.reddit.com/r/zfs/comments/k0vb7v/monitoring_zfs_for_pool_status_drive_failures/


===== zfs holds =====

Holds prevent snapshots from being destroyed. See https://docs.oracle.com/cd/E19253-01/819-5461/gjdfk/index.html
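
Placing and releasing a hold manually looks like this (the tag and snapshot names are just examples):

<code>
zfs hold keep tank/data@migrate      # add a hold tagged "keep"
zfs holds tank/data@migrate          # list holds on the snapshot
zfs release keep tank/data@migrate   # drop the hold so the snapshot can be destroyed again
</code>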


List all holds in all pools.
<code>
zfs get -Ht snapshot userrefs | grep -v $'\t'0 | cut -d $'\t' -f 1 | tr '\n' '\0' | xargs -0 zfs holds
</code>

List holds in a specific dataset.
<code>
zfs list -H -r -d 1 -t snapshot -o name nameoffilesystem | xargs zfs holds
</code>

===== zfs snapshots - show space used =====

<code>
#!/bin/bash
# List each snapshot of the dataset given as $1 and the space that would be
# reclaimed by destroying it, using `zfs destroy -nv` as a dry run.
{ echo -e "NAME\tUSED";
for snapshot in $(zfs list -Hpr "$1" -t snapshot -o name -s creation -d 1); do
        # Rewriting "@" to "@%" turns the name into a range (oldest snapshot
        # through this one), so each figure is cumulative.
        echo -ne "${snapshot/@/@%}\t"
        zfs destroy -nv "${snapshot/@/@%}" | sed -nre "s/would reclaim (.+)/\1/p"
done } | column -t
</code>
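
Saved as, say, ''snapspace.sh'', it takes the dataset as its only argument:

<code>
./snapspace.sh tank/data
</code>

Because of the ''@%'' range rewrite, the USED column is cumulative: each line shows how much space destroying that snapshot and every older one would reclaim.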