diff --git a/sections/en-us/os.md b/sections/en-us/os.md
index 30ac190..25c0700 100644
--- a/sections/en-us/os.md
+++ b/sections/en-us/os.md
@@ -1,8 +1,374 @@
 # OS
 
 * `[Doc]` TTY
-* `[Doc]` OS
+* `[Doc]` OS (Operating System)
 * `[Doc]` Command Line Options
 * `[Basic]` Load
 * `[Point]` CheckList
 * `[Basic]` Indicators
+
+## TTY
+
+"TTY" stands for "teletype", originally a typewriter-like terminal, and "pty" stands for "pseudo-teletype", a terminal emulated in software. In Unix, `/dev/tty*` refers to any device that acts as a teletype, such as a terminal.
+
+You can list the currently logged-in users with the `w` command, and you'll find a new tty for every window you log in from.
+
+```shell
+$ w
+ 11:49:43 up 482 days, 19:38,  3 users,  load average: 0.03, 0.08, 0.07
+USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
+dev      pts/0    10.0.128.252     10:44    1:01m  0.09s  0.07s -bash
+dev      pts/2    10.0.128.252     11:08    2:07   0.17s  0.14s top
+root     pts/3    10.0.240.2       11:43    7.00s  0.04s  0.00s w
+```
+
+The `ps` command also shows the tty of each process:
+
+```shell
+$ ps -x
+  PID TTY      STAT   TIME COMMAND
+ 5530 ?        S      0:00 sshd: dev@pts/3
+ 5531 pts/3    Ss+    0:00 -bash
+11296 ?        S      0:00 sshd: dev@pts/4
+11297 pts/4    Ss     0:00 -bash
+13318 pts/4    R+     0:00 ps -x
+23733 ?        Ssl    2:53 PM2 v1.1.2: God Daemon
+```
+
+A process marked with `?` does not depend on a TTY; such a process is called a [Daemon](/sections/en-us/process.md#%E5%AE%88%E6%8A%A4%E8%BF%9B%E7%A8%8B).
+
+In Node.js, you can use the `isTTY` property of the stdio streams to determine whether the current process is running in a TTY (such as a terminal) environment.
+
+```shell
+$ node -p -e "Boolean(process.stdout.isTTY)"
+true
+$ node -p -e "Boolean(process.stdout.isTTY)" | cat
+false
+```
+
+## OS
+
+The `os` module provides utility functions for getting basic information about the current system.
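As a minimal sketch of how the module is used, the snippet below gathers a few values into one object. `summarizeSystem` is a hypothetical helper name introduced for illustration, not part of Node.js, and the choice of fields is arbitrary.

```javascript
const os = require('os');

// Hypothetical helper (not a Node.js API): collect a small snapshot of
// system information using documented functions of the os module.
function summarizeSystem() {
  return {
    platform: os.platform(),          // same value as process.platform
    arch: os.arch(),
    hostname: os.hostname(),
    cpuCount: os.cpus().length,       // number of logical CPU cores reported
    freeMemMB: Math.round(os.freemem() / 1024 / 1024),
    totalMemMB: Math.round(os.totalmem() / 1024 / 1024),
    uptimeSeconds: os.uptime(),
  };
}

console.log(summarizeSystem());
```

The memory values are converted from bytes to megabytes only for readability; the raw functions return bytes.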
+
+|Attribute|Description|
+|---|---|
+|os.EOL|The end-of-line marker of the current operating system (`\n` on POSIX, `\r\n` on Windows)|
+|os.arch()|Returns the CPU architecture of the current system, such as `'x86'` or `'x64'`|
+|os.constants|An object containing commonly used system constants|
+|os.cpus()|Returns information about each logical CPU core|
+|os.endianness()|Returns the byte order of the CPU: `'BE'` for big endian, `'LE'` for little endian|
+|os.freemem()|Returns the amount of free system memory, in bytes|
+|os.homedir()|Returns the home directory of the current user|
+|os.hostname()|Returns the hostname of the current system|
+|os.loadavg()|Returns an array with the 1, 5, and 15 minute load averages|
+|os.networkInterfaces()|Returns the network interface information (similar to `ifconfig`)|
+|os.platform()|Returns the platform specified at compile time, such as `win32` or `linux`; same as `process.platform`|
+|os.release()|Returns the release version of the operating system|
+|os.tmpdir()|Returns the default directory for temporary files|
+|os.totalmem()|Returns the total amount of system memory, in bytes (the size of the installed memory)|
+|os.type()|Returns the name of the system as reported by [`uname`](https://en.wikipedia.org/wiki/Uname#Examples)|
+|os.uptime()|Returns the system uptime, in seconds|
+|os.userInfo([options])|Returns information about the current user|
+
+> What's the difference between the line breaks (EOL) of different operating systems?
+
+End of line (EOL) is also known as newline, line ending, or line break.
+
+It is usually composed of line feed (LF, `\n`), carriage return (CR, `\r`), or both.
+Here are some common cases:
+
+|Symbol|System|
+|---|---|
+|LF|Unix and Unix-compatible systems (GNU/Linux, AIX, Xenix, Mac OS X, ...), BeOS, Amiga, RISC OS|
+|CR+LF|MS-DOS, Microsoft Windows, most non-Unix systems|
+|CR|Apple II family, Mac OS up to version 9|
+
+If you don't understand the EOL differences between systems, you may run into problems when splitting a file into lines or counting its lines.
+
+### OS Constants
+
+* Signal Constants, such as `SIGHUP`, `SIGKILL`, etc.
+* POSIX Error Constants, such as `EACCES`, `EADDRINUSE`, etc.
+* Windows Specific Error Constants, such as `WSAEACCES`, `WSAEBADF`, etc.
+* libuv Constants, only `UV_UDP_REUSEADDR`.
+
+
+## Path
+
+The built-in `path` module in Node.js handles file paths, but as we all know, path conventions differ between operating systems.
+
+### Windows vs. POSIX
+
+|POSIX|Value|Windows|Value|
+|---|---|---|---|
+|path.posix.sep|`'/'`|path.win32.sep|`'\\'`|
+|path.posix.normalize('/foo/bar//baz/asdf/quux/..')|`'/foo/bar/baz/asdf'`|path.win32.normalize('C:\\temp\\\\foo\\bar\\..\\')|`'C:\\temp\\foo\\'`|
+|path.posix.basename('/tmp/myfile.html')|`'myfile.html'`|path.win32.basename('C:\\temp\\myfile.html')|`'myfile.html'`|
+|path.posix.join('/asdf', '/test.html')|`'/asdf/test.html'`|path.win32.join('/asdf', '/test.html')|`'\\asdf\\test.html'`|
+|path.posix.relative('/root/a', '/root/b')|`'../b'`|path.win32.relative('C:\\a', 'c:\\b')|`'..\\b'`|
+|path.posix.isAbsolute('/baz/..')|`true`|path.win32.isAbsolute('C:\\foo\\..')|`true`|
+|path.posix.delimiter|`':'`|path.win32.delimiter|`';'`|
+|process.env.PATH|`'/usr/bin:/bin'`|process.env.PATH|`'C:\Windows\system32;C:\Program Files\node\'`|
+|PATH.split(path.posix.delimiter)|`['/usr/bin', '/bin']`|PATH.split(path.win32.delimiter)|`['C:\\Windows\\system32', 'C:\\Program Files\\node\\']`|
+
+
+After looking at the table, you should realize that on a given platform, the methods of the `path` module are the methods of the corresponding platform-specific implementation. For example, I use a Mac here, so:
+
+```javascript
+const path = require('path');
+console.log(path.basename === path.posix.basename); // true
+```
+
+If you are on one of these platforms but need to handle paths from the other, you need to be aware of this cross-platform issue and use `path.posix` or `path.win32` explicitly.
+
+### path Object
+
+on POSIX:
+
+```javascript
+path.parse('/home/user/dir/file.txt')
+// Returns:
+// {
+//    root : "/",
+//    dir : "/home/user/dir",
+//    base : "file.txt",
+//    ext : ".txt",
+//    name : "file"
+// }
+```
+
+```javascript
+┌─────────────────────┬────────────┐
+│          dir        │    base    │
+├──────┬              ├──────┬─────┤
+│ root │              │ name │ ext │
+"  /    home/user/dir / file  .txt "
+└──────┴──────────────┴──────┴─────┘
+```
+
+on Windows:
+
+```javascript
+path.parse('C:\\path\\dir\\file.txt')
+// Returns:
+// {
+//    root : "C:\\",
+//    dir : "C:\\path\\dir",
+//    base : "file.txt",
+//    ext : ".txt",
+//    name : "file"
+// }
+```
+
+```javascript
+┌─────────────────────┬────────────┐
+│          dir        │    base    │
+├──────┬              ├──────┬─────┤
+│ root │              │ name │ ext │
+" C:\      path\dir   \ file  .txt "
+└──────┴──────────────┴──────┴─────┘
+```
+
+### path.extname(path)
+
+|case|return|
+|---|---|
+|path.extname('index.html')|`'.html'`|
+|path.extname('index.coffee.md')|`'.md'`|
+|path.extname('index.')|`'.'`|
+|path.extname('index')|`''`|
+|path.extname('.index')|`''`|
+
+
+## Command Line Options
+
+Command Line Options is the documentation on using the Node.js CLI.
+There are 4 main ways of using the CLI:
+
+* `node [options] [v8 options] [script.js | -e "script"] [arguments]`
+* `node debug [script.js | -e "script" | <host>:<port>] …`
+* `node --v8-options`
+* `node` without any arguments, which starts the REPL environment directly
+
+### Options
+
+|Parameter|Introduction|
+|---|---|
+|-v, --version|Shows the version of the current node|
+|-h, --help|Shows help documentation|
+|-e, --eval "script"|Executes the argument string as code|
+|-p, --print "script"|Same as `-e`, but prints the result|
+|-c, --check|Checks syntax without executing the code|
+|-i, --interactive|Opens REPL mode even if stdin is not a terminal|
+|-r, --require module|Preloads (`require`s) the specified module at startup|
+|--no-deprecation|Silences deprecation warnings|
+|--trace-deprecation|Prints stack traces for deprecations|
+|--throw-deprecation|Throws errors for deprecations|
+|--no-warnings|Silences all warnings (including deprecation warnings)|
+|--trace-warnings|Prints stack traces for warnings (including deprecations)|
+|--trace-sync-io|Prints a stack trace whenever synchronous I/O is detected after the first turn of the event loop|
+|--zero-fill-buffers|Zero-fills all newly allocated **Buffer** and **SlowBuffer** instances|
+|--preserve-symlinks|Instructs the module loader to preserve symbolic links when resolving and caching modules|
+|--track-heap-objects|Tracks the allocation of heap objects for heap snapshots|
+|--prof-process|Processes the V8 profiler output generated with the V8 option `--prof`|
+|--v8-options|Shows the V8 command line options|
+|--tls-cipher-list=list|Specifies an alternative default TLS cipher list|
+|--enable-fips|Turns on FIPS-compliant crypto at startup|
+|--force-fips|Enforces FIPS-compliant crypto at startup|
+|--openssl-config=file|Loads the OpenSSL configuration file at startup|
+|--icu-data-dir=file|Specifies the loading path of ICU data|
+
+### Environment Variable
+
+|Environment variable|Introduction|
+|----|----|
+|`NODE_DEBUG=module[,…]`|Specifies the list of core modules that should print debug information|
+|`NODE_PATH=path[:…]`|Specifies a list of directories prefixed to the module search path|
+|`NODE_DISABLE_COLORS=1`|Disables colors in the REPL|
+|`NODE_ICU_DATA=file`|Data path for ICU (`Intl` object) data|
+|`NODE_REPL_HISTORY=file`|Path to the file used to store the persistent REPL history|
+|`NODE_TTY_UNSAFE_ASYNC=1`|When set to 1, writes to stdout and stderr become non-blocking and asynchronous when outputting to a TTY (so `console.log` is no longer synchronous there)|
+|`NODE_EXTRA_CA_CERTS=file`|Specifies the path to a file of extra CA certificates (such as VeriSign)|
+
+## Load
+
+Load is an important measure of the running state of a server. From the load, we can tell whether the server is idle, healthy, busy, or about to crash.
+
+Typically, the load we want to look at is the CPU load. For more information, you can read this blog post: [Understanding Linux CPU Load](http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages).
+
+To get the current system load, you can use the `uptime` or `top` command in a terminal, or `os.loadavg()` in Node.js:
+
+```
+load average: 0.09, 0.05, 0.01
+```
+
+These are the average system loads over the last 1 minute, 5 minutes, and 15 minutes. When one CPU core is working at full load, the load value is 1, so the value represents how many CPU cores are fully loaded.
+
+In Node.js, the CPU load of a single process can be viewed using the [pidusage](https://github.com/soyuka/pidusage) module.
+
+In addition to the CPU load, whoever maintains the server also needs to know about the network load, disk load, and so on.
+
+## CheckList
+
+> A police officer sees a drunken man intently searching the ground near a lamppost and asks him the goal of his quest. The inebriate replies that he is looking for his car keys, and the officer helps for a few minutes without success, then he asks whether the man is certain that he dropped the keys near the lamppost.
+“No,” is the reply, “I lost the keys somewhere across the street.” “Why look here?” asks the surprised and irritated officer. “The light is much better here,” the intoxicated man responds with aplomb.
+
+When it comes to checking server status, many server-side developers only know how to use the `top` command. The situation is the same as in the joke above, because `top` is the brightest street lamp for them.
+
+For server-side programmers, the full server-side checklist is the [USE Method](http://www.brendangregg.com/USEmethod/use-linux.html) described in the second chapter of [Systems Performance](https://www.amazon.cn/%E5%9B%BE%E4%B9%A6/dp/0133390098).
+
+The USE Method provides a strategy for performing a complete check of system health, identifying common bottlenecks and errors. For each system resource, metrics for utilization, saturation and errors are identified and checked. Any issues discovered are then investigated using further strategies.
+
+This is an example USE-based metric list for Linux operating systems (e.g., Ubuntu, CentOS, Fedora). This is primarily intended for system administrators of the physical systems, who are using command line tools. Some of these metrics can be found in remote monitoring tools.
+
+### Physical Resources
+
+|component|type|metric|
+|---|---|---|
+|CPU|utilization|system-wide: vmstat 1, "us" + "sy" + "st"; sar -u, sum fields except "%idle" and "%iowait"; dstat -c, sum fields except "idl" and "wai"; per-cpu: mpstat -P ALL 1, sum fields except "%idle" and "%iowait"; sar -P ALL, same as mpstat; per-process: top, "%CPU"; htop, "CPU%"; ps -o pcpu; pidstat 1, "%CPU"; per-kernel-thread: top/htop ("K" to toggle), where VIRT == 0 (heuristic). [1]|
+|CPU|saturation|system-wide: vmstat 1, "r" > CPU count [2]; sar -q, "runq-sz" > CPU count; dstat -p, "run" > CPU count; per-process: /proc/PID/schedstat 2nd field (sched_info.run_delay); perf sched latency (shows "Average" and "Maximum" delay per-schedule); dynamic tracing, eg, SystemTap schedtimes.stp "queued(us)" [3]|
+|CPU|errors|perf (LPE) if processor specific error events (CPC) are available; eg, AMD64's "04Ah Single-bit ECC Errors Recorded by Scrubber" [4]|
+|Memory capacity|utilization|system-wide: free -m, "Mem:" (main memory), "Swap:" (virtual memory); vmstat 1, "free" (main memory), "swap" (virtual memory); sar -r, "%memused"; dstat -m, "free"; slabtop -s c for kmem slab usage; per-process: top/htop, "RES" (resident main memory), "VIRT" (virtual memory), "Mem" for system-wide summary|
+|Memory capacity|saturation|system-wide: vmstat 1, "si"/"so" (swapping); sar -B, "pgscank" + "pgscand" (scanning); sar -W; per-process: 10th field (min_flt) from /proc/PID/stat for minor-fault rate, or dynamic tracing [5]; OOM killer: dmesg \| grep killed|
+|Memory capacity|errors|dmesg for physical failures; dynamic tracing, eg, SystemTap uprobes for failed malloc()s|
+|Network Interfaces|utilization|sar -n DEV 1, "rxKB/s"/max "txKB/s"/max; ip -s link, RX/TX tput / max bandwidth; /proc/net/dev, "bytes" RX/TX tput/max; nicstat "%Util" [6]|
+|Network Interfaces|saturation|ifconfig, "overruns", "dropped"; netstat -s, "segments retransmited"; sar -n EDEV, \*drop and \*fifo metrics; /proc/net/dev, RX/TX "drop"; nicstat "Sat" [6]; dynamic tracing for other TCP/IP stack queueing [7]|
+|Network Interfaces|errors|ifconfig, "errors", "dropped"; netstat -i, "RX-ERR"/"TX-ERR"; ip -s link, "errors"; sar -n EDEV, "rxerr/s" "txerr/s"; /proc/net/dev, "errs", "drop"; extra counters may be under /sys/class/net/...; dynamic tracing of driver function returns 76]|
+|Storage device I/O|utilization|system-wide: iostat -xz 1, "%util"; sar -d, "%util"; per-process: iotop; pidstat -d; /proc/PID/sched "se.statistics.iowait_sum"|
+|Storage device I/O|saturation|iostat -xnz 1, "avgqu-sz" > 1, or high "await"; sar -d same; LPE block probes for queue length/latency; dynamic/static tracing of I/O subsystem (incl. LPE block probes)|
+|Storage device I/O|errors|/sys/devices/.../ioerr_cnt; smartctl; dynamic/static tracing of I/O subsystem response codes [8]|
+|Storage capacity|utilization|swap: swapon -s; free; /proc/meminfo "SwapFree"/"SwapTotal"; file systems: "df -h"|
+|Storage capacity|saturation|not sure this one makes sense - once it's full, ENOSPC|
+|Storage capacity|errors|strace for ENOSPC; dynamic tracing for ENOSPC; /var/log/messages errs, depending on FS|
+|Storage controller|utilization|iostat -xz 1, sum devices and compare to known IOPS/tput limits per-card|
+|Storage controller|saturation|see storage device saturation, ...|
+|Storage controller|errors|see storage device errors, ...|
+|Network controller|utilization|infer from ip -s link (or /proc/net/dev) and known controller max tput for its interfaces|
+|Network controller|saturation|see network interface saturation, ...|
+|Network controller|errors|see network interface errors, ...|
+|CPU interconnect|utilization|LPE (CPC) for CPU interconnect ports, tput / max|
+|CPU interconnect|saturation|LPE (CPC) for stall cycles|
+|CPU interconnect|errors|LPE (CPC) for whatever is available|
+|Memory interconnect|utilization|LPE (CPC) for memory busses, tput / max; or CPI greater than, say, 5; CPC may also have local vs remote counters|
+|Memory interconnect|saturation|LPE (CPC) for stall cycles|
+|Memory interconnect|errors|LPE (CPC) for whatever is available|
+|I/O interconnect|utilization|LPE (CPC) for tput / max if available; inference via known tput from iostat/ip/...|
+|I/O interconnect|saturation|LPE (CPC) for stall cycles|
+|I/O interconnect|errors|LPE (CPC) for whatever is available|
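The CPU saturation row compares the run-queue length with the CPU count. A rough Node.js analogue, offered only as a sketch, is to divide `os.loadavg()` by the number of logical cores. `loadPerCore` is a hypothetical helper, and treating a per-core value above 1 as "saturated" is an assumption made for illustration, not a rule taken from the USE Method.

```javascript
const os = require('os');

// Hypothetical helper: normalize each load average by the logical core count.
// A core count below 1 is treated as 1 to avoid dividing by zero.
function loadPerCore(loadavg, cpuCount) {
  const cores = Math.max(1, cpuCount);
  return loadavg.map((load) => load / cores);
}

const [one, five, fifteen] = loadPerCore(os.loadavg(), os.cpus().length);

// Assumption for illustration: more than one unit of load per core over
// the last minute is flagged as possible CPU saturation.
console.log({ one, five, fifteen, possiblySaturated: one > 1 });
```

Note that `os.loadavg()` always returns `[0, 0, 0]` on Windows, so this check is only meaningful on Unix-like systems.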
+
+
+### Software Resources
+
+|component|type|metric|
+|---|---|---|
+|Kernel mutex|utilization|With CONFIG_LOCK_STAT=y, /proc/lock_stat "holdtime-total" / "acquisitions" (also see "holdtime-min", "holdtime-max") [8]; dynamic tracing of lock functions or instructions (maybe)|
+|Kernel mutex|saturation|With CONFIG_LOCK_STAT=y, /proc/lock_stat "waittime-total" / "contentions" (also see "waittime-min", "waittime-max"); dynamic tracing of lock functions or instructions (maybe); spinning shows up with profiling (perf record -a -g -F 997 ..., oprofile, dynamic tracing)|
+|Kernel mutex|errors|dynamic tracing (eg, recursive mutex enter); other errors can cause kernel lockup/panic, debug with kdump/crash|
+|User mutex|utilization|valgrind --tool=drd --exclusive-threshold=... (held time); dynamic tracing of lock to unlock function time|
+|User mutex|saturation|valgrind --tool=drd to infer contention from held time; dynamic tracing of synchronization functions for wait time; profiling (oprofile, LPE, ...) user stacks for spins|
+|User mutex|errors|valgrind --tool=drd various errors; dynamic tracing of pthread_mutex_lock() for EAGAIN, EINVAL, EPERM, EDEADLK, ENOMEM, EOWNERDEAD, ...|
+|Task capacity|utilization|top/htop, "Tasks" (current); sysctl kernel.threads-max, /proc/sys/kernel/threads-max (max)|
+|Task capacity|saturation|threads blocking on memory allocation; at this point the page scanner should be running (sar -B "pgscan\*"), else examine using dynamic tracing|
+|Task capacity|errors|"can't fork()" errors; user-level threads: pthread_create() failures with EAGAIN, EINVAL, ...; kernel: dynamic tracing of kernel_thread() ENOMEM|
+|File descriptors|utilization|system-wide: sar -v, "file-nr" vs /proc/sys/fs/file-max; dstat --fs, "files"; or just /proc/sys/fs/file-nr; per-process: ls /proc/PID/fd \| wc -l vs ulimit -n|
+|File descriptors|saturation|does this make sense? I don't think there is any queueing or blocking, other than on memory allocation.|
+|File descriptors|errors|strace errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...).|
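The per-process file descriptor check (`ls /proc/PID/fd | wc -l`) can be approximated from inside a Node.js process. The following is a minimal, Linux-only sketch; `countOpenFds` is a hypothetical helper, and it assumes `/proc` is mounted, returning `null` otherwise.

```javascript
const fs = require('fs');

// Hypothetical helper: count this process's open file descriptors by
// listing /proc/self/fd. Returns null where /proc is unavailable
// (for example on macOS or Windows).
function countOpenFds() {
  try {
    return fs.readdirSync('/proc/self/fd').length;
  } catch (err) {
    return null;
  }
}

const before = countOpenFds();
// Opening any file consumes one descriptor; the node binary itself is
// used here only because it is guaranteed to exist.
const fd = fs.openSync(process.execPath, 'r');
const after = countOpenFds();
fs.closeSync(fd);

console.log({ before, after });
```

Comparing the count against the limit reported by `ulimit -n` gives the utilization figure; a count that keeps growing over time usually points to a descriptor leak.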
+
+#### ulimit
+
+`ulimit` is used to view and set the limits on system resources available to a user's shell and the processes it starts.
+
+```
+-a   All current limits are reported
+-c   The maximum size of core files created, in blocks
+-d   The maximum size of a process's data segment, in KB
+-f   The maximum size of files written by the shell and its children, in blocks
+-H   Set a hard limit for the resource, that is, the limit set by the administrator
+-m   The maximum resident set size, in KB
+-n   The maximum number of file descriptors open at the same time
+-p   The pipe size, in 512-byte blocks
+-s   The maximum stack size, in KB
+-S   Set a soft limit for the resource
+-t   The maximum amount of cpu time, in seconds
+-u   The maximum number of processes available to a single user
+-v   The maximum amount of virtual memory available to the shell, in KB
+```
+
+For example:
+
+```
+$ ulimit -a
+core file size          (blocks, -c) 0
+data seg size           (kbytes, -d) unlimited
+scheduling priority             (-e) 0
+file size               (blocks, -f) unlimited
+pending signals                 (-i) 127988
+max locked memory       (kbytes, -l) 64
+max memory size         (kbytes, -m) unlimited
+open files                      (-n) 655360
+pipe size            (512 bytes, -p) 8
+POSIX message queues     (bytes, -q) 819200
+real-time priority              (-r) 0
+stack size              (kbytes, -s) 8192
+cpu time               (seconds, -t) unlimited
+max user processes              (-u) 4096
+virtual memory          (kbytes, -v) unlimited
+file locks                      (-x) unlimited
+```
+
+Note: open sockets and similar resources are also file descriptors. If `ulimit -n` is too small, you will be unable to open files and also unable to establish socket connections.
\ No newline at end of file