Update and rename README.md to README-EN.md

Morsmalleo 2022-06-01 09:44:09 +08:00 committed by GitHub
parent 95ee8edef1
commit 4cc87ef6fc
2 changed files with 175 additions and 174 deletions

175
README-EN.md Normal file

@@ -0,0 +1,175 @@
# What is Ip2region?
ip2region is an offline IP address geolocation library with 99.9% accuracy and query times at the 0.0x millisecond level. The ip2region.db database is only a few MB. It provides query bindings for java, php, c, python, nodejs, golang and c#, and three query algorithms: binary, b-tree and memory.
# Ip2region features
### 99.9% accuracy rate
The data is aggregated from several well-known IP-to-location lookup providers, and the accuracy figures are the ones those providers officially state. In testing, it proved somewhat more accurate than the classic Chunzhen (CZ88) IP location database. <br />
The data of ip2region is aggregated from the open APIs or data of the following service providers (the update program issues 2 to 4 requests per second): <br />
01, &gt;80%, Taobao IP address database, [http://ip.taobao.com/](http://ip.taobao.com/) <br />
02, ≈10%, GeoIP, [https://geoip.com/](https://geoip.com/) <br />
03, ≈2%, Chunzhen IP database, [http://www.cz88.net/](http://www.cz88.net/) <br />
<b>Remarks:</b> If none of the providers above continue to offer open APIs or data, ip2region will stop updating its data.
### Standardized data format
Each IP data segment has a fixed format:
```
_cityId|country|region|province|city|ISP_
```
Only the data for China is accurate to the city level. Some data for other countries can only be located to the country, and in that case all the remaining fields are 0. The data already covers every country, large or small, that you can look up. (Please ignore the leading city Id; it exists for a personal project requirement.)
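As a rough illustration of the record format above, the following Python sketch splits one result string into named fields. The `Region` type, `parse_region` and `is_unknown` are hypothetical helpers written for this example; they are not part of any ip2region binding.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One record in the cityId|country|region|province|city|ISP layout."""
    city_id: int
    country: str
    region: str
    province: str
    city: str
    isp: str


def parse_region(record: str) -> Region:
    """Split a '|'-separated record into its six fields."""
    parts = record.strip().split("|")
    if len(parts) != 6:
        raise ValueError(f"expected 6 '|'-separated fields, got {len(parts)}")
    city_id, country, region, province, city, isp = parts
    return Region(int(city_id), country, region, province, city, isp)


def is_unknown(value: str) -> bool:
    """Fields that could not be resolved are stored as the string '0'."""
    return value == "0"


info = parse_region("2163|China|South|Guangdong|Shenzhen|Dr. Peng")
print(info.country, info.city, info.isp)  # China Shenzhen Dr. Peng
```

Checking fields with `is_unknown` before displaying them avoids showing a literal `0` for records that only resolve to the country level.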
### Small size
The generated database file ip2region.db contains all IPs yet is only a few MB; the smallest version is only 1.5 MB. The size grows slowly as the data becomes more detailed, and currently does not exceed 8 MB.
### Fast query speed
A single query takes 0.x milliseconds in every query client. There are three built-in query algorithms:
1. memory algorithm: the entire database is loaded into memory; a single query takes within 0.1x milliseconds, and a single query with the C client is at the 0.00x millisecond level.
2. binary algorithm: binary search over the ip2region.db file, no need to load it into memory; a single query is at the 0.x millisecond level.
3. b-tree algorithm: a b-tree lookup over the ip2region.db file, no need to load it into memory; a single query is at the 0.x millisecond level, faster than the binary algorithm.
In every client the b-tree algorithm is faster than the binary algorithm, and the memory algorithm is of course the fastest.
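To make the idea behind the binary algorithm concrete, here is a minimal Python sketch of a binary search over sorted, non-overlapping IP segments held in a list. It only illustrates the general technique: it does not read ip2region.db and does not reproduce the file layout, and the segments and region strings are made-up placeholders built on documentation address ranges.

```python
import bisect
import ipaddress
from typing import Optional

# Made-up placeholder segments: (start_ip, end_ip, region string).
SEGMENTS = [
    ("192.0.2.0", "192.0.2.255", "0|ExampleA|0|0|0|0"),
    ("198.51.100.0", "198.51.100.255", "0|ExampleB|0|0|0|0"),
    ("203.0.113.0", "203.0.113.255", "0|ExampleC|0|0|0|0"),
]


def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


# Sort by start address so the start column can be binary-searched.
_TABLE = sorted((ip_to_int(s), ip_to_int(e), r) for s, e, r in SEGMENTS)
_STARTS = [row[0] for row in _TABLE]


def lookup(ip: str) -> Optional[str]:
    """Return the region string of the segment containing ip, or None."""
    value = ip_to_int(ip)
    # Index of the last segment whose start address is <= value.
    idx = bisect.bisect_right(_STARTS, value) - 1
    if idx < 0:
        return None
    _start, end, region = _TABLE[idx]
    return region if value <= end else None


print(lookup("198.51.100.7"))
```

Each lookup costs O(log n) comparisons, which is why a search of this kind can stay fast without any per-query scan of the whole data set.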
### Multi-query client support
The clients already integrated are: java, C#, php, c, python, nodejs, php extensions (php5 and php7), golang, rust, lua, lua_c, nginx.
binding | description | development status | binary query time | b-tree query time | memory query time
:-: | :-: | :-: | :-: | :-: | :-:
[c](binding/c) | ANSI C binding | completed | 0.0x ms | 0.0x ms | 0.00x ms
[c#](binding/c#) | c# binding | completed | 0.x ms | 0.x ms | 0.1x ms
[golang](binding/golang) | golang binding | completed | 0.x ms | 0.x ms | 0.1x ms
[java](binding/java) | java binding | completed | 0.x ms | 0.x ms | 0.1x ms
[lua](binding/lua) | binding implemented in lua | completed | 0.x ms | 0.x ms | 0.x ms
[lua_c](binding/lua_c) | c extension for lua | completed | 0.0x ms | 0.0x ms | 0.00x ms
[nginx](binding/nginx) | c extension for nginx | completed | 0.0x ms | 0.0x ms | 0.00x ms
[nodejs](binding/nodejs) | nodejs | completed | 0.x ms | 0.x ms | 0.1x ms
[php](binding/php) | binding implemented in php | completed | 0.x ms | 0.1x ms | 0.1x ms
[php5_ext](binding/php5_ext) | c extension for php5 | completed | 0.0x ms | 0.0x ms | 0.00x ms
[php7_ext](binding/php7_ext) | c extension for php7 | completed | 0.0x ms | 0.0x ms | 0.00x ms
[python](binding/python) | python binding | completed | 0.x ms | 0.x ms | 0.x ms
[rust](binding/rust) | rust binding | completed | 0.x ms | 0.x ms | 0.x ms
# ip2region quick test
Please refer to the README under each binding to run its CLI test program. For example, the C demo runs as follows:
```shell
cd binding/c/
gcc -g -O2 testSearcher.c ip2region.c
./a.out ../../data/ip2region.db
```
You will see the following CLI interface:
```shell
initializing B-tree ...
+----------------------------------+
| ip2region test script |
| Author: chenxin619315@gmail.com |
| Type 'quit' to exit program |
+----------------------------------+
p2region>> 101.105.35.57
2163|China|South|Guangdong|Shenzhen|Dr. Peng in 0.02295 millseconds
```
Enter an IP address to start testing; the first query will be slightly slow. Append binary or memory to the run command to try the other algorithms. The b-tree algorithm is recommended; if you have demanding speed and concurrency requirements, use the memory algorithm. For integration details, refer to the test source code under each binding.
# ip2region installation
Refer to the README and test demos under each binding for details. The following shortcut installation methods are available.
### maven dependency
```xml
<dependency>
    <groupId>org.lionsoul</groupId>
    <artifactId>ip2region</artifactId>
    <version>1.7.2</version>
</dependency>
```
### nodejs
```
npm install node-ip2region --save
```
### nuget install
```shell
Install-Package IP2Region
```
### php composer
```shell
# Plugin from: https://github.com/zoujingli/ip2region
composer require zoujingli/ip2region
```
# ip2region Concurrent use
1. The search interfaces of all bindings are <b>not</b> thread-safe; different threads can use them by each creating its own query object. Under high concurrency, the binary and b-tree algorithms may fail with a "too many open files" error. In that case, raise the kernel's maximum number of open files (fs.file-max=a higher value) or use a persistent memory algorithm.
2. The memorySearch interface can be used safely in a multi-threaded environment, provided one pre-query is performed before the object is published (this essentially loads the ip2region.db file into memory).
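The "one query object per thread" advice in point 1 can be sketched in Python with `threading.local`. `DemoSearcher` below is a hypothetical stand-in written for this example and is not a real ip2region class; substitute the searcher type of whichever binding you use.

```python
import threading


class DemoSearcher:
    """Hypothetical stand-in for a binding's (non-thread-safe) searcher."""

    def __init__(self, db_path: str):
        self.db_path = db_path

    def search(self, ip: str) -> str:
        # A real searcher would read from the db file here.
        return f"looked up {ip} using {self.db_path}"


_local = threading.local()


def get_searcher(db_path: str = "ip2region.db") -> DemoSearcher:
    """Return this thread's own searcher, creating it on first use."""
    searcher = getattr(_local, "searcher", None)
    if searcher is None:
        searcher = DemoSearcher(db_path)
        _local.searcher = searcher
    return searcher


def worker(ip: str, results: list, index: int) -> None:
    # Each thread only ever touches the searcher it created itself.
    results[index] = get_searcher().search(ip)


results = [None] * 4
threads = [
    threading.Thread(target=worker, args=(f"192.0.2.{i}", results, i))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

With file-backed algorithms, every per-thread searcher holds its own file handle, which is how a large thread count can run into the open-files limit mentioned above.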
# ip2region.db generation
Starting from version 1.8, ip2region has open-sourced the java implementation of the ip2region.db generator, with ant build support. Building it produces the dbMaker-{version}.jar mentioned below. If you need to study the generator or customize the generation configuration, refer to the java source code in ${ip2region_root}/maker/java.
Starting with ip2region version 1.2.2, an executable jar file, dbMaker-{version}.jar, is committed in the repository; use it to generate the database as follows:
1. Make sure a java environment is installed (if you don't normally use Java, a quick search will show you how; for temporary use it only takes a few minutes).
2. cd to ${ip2region_root}/maker/java and run the following command:
```shell
java -jar dbMaker-{version}.jar -src <text data file> -region <region csv file> [-dst <directory for the generated ip2region.db file>]
# text data file: path to the original text data from which the db file is built; the bundled ip2region.db was generated from /data/ip.merge.txt, which you can replace with your own file or edit and then regenerate
# region csv file: lets ip2region store the data relationships so that the resulting data contains a city_id; you can use /data/origin/global_region.csv directly
# directory for the ip2region.db file: optional; if not specified, a ./data/ip2region.db file is generated under the current directory
```
3. Overwrite the original ip2region.db file with the generated one.
4. The default ip2region.db file generation command:
```shell
cd ${ip2region_root}/java/
java -jar dbMaker-1.2.2.jar -src ./data/ip.merge.txt -region ./data/global_region.csv
# You will see a large amount of output
```
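If you want to script the generation step, the command line above can be assembled programmatically. The Python sketch below only builds the argument list from the `-src`, `-region` and `-dst` options described above; `db_maker_command` is a hypothetical helper name, and the sketch does not run Java itself.

```python
from typing import List, Optional


def db_maker_command(
    jar: str, src: str, region: str, dst: Optional[str] = None
) -> List[str]:
    """Build the argument list for the dbMaker jar; -dst is optional."""
    cmd = ["java", "-jar", jar, "-src", src, "-region", region]
    if dst is not None:
        cmd += ["-dst", dst]
    return cmd


cmd = db_maker_command(
    "dbMaker-1.2.2.jar", "./data/ip.merge.txt", "./data/global_region.csv"
)
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the directory that contains the jar would then execute it, assuming Java is on the PATH.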
# Related remarks
### Declaration
ip2region focuses on <b>researching IP data storage design and query implementations in various languages</b>. It has no original IP data of its own; see the description above for the data sources. Updating the data requires many IPs and puts a certain amount of request pressure on the source platforms, so this project does not guarantee timely data updates. There is no commercial version and there will not be one. You can import your own data into ip2region to build a custom query implementation.
### Technical Communication
1. For the structure and principles of the database file, read @冬芽's blog post ["ip2region database file structure and principle"](https://github.com/dongyado/dongyado.github.io/blob/master/_posts/2016-08-18-structure-of-ip2region-database-file.md) and the video [ip2region data structure design and implementation video share](https://www.bilibili.com/video/BV1wv4y1N7SD).
2. For ip2region discussion and sharing, WeChat: lionsoul2014 (please mention ip2region), QQ: 1187582057 (rarely checked).
3. Videos on the detection-algorithm-based data update method: [data update to achieve video sharing part1](https://www.bilibili.com/video/BV1934y1E7Q5/), [data update to achieve video sharing part2](https://www.bilibili.com/video/BV1pF411j7Aw/)
Translated with DeepL https://www.deepl.com/app/?utm_medium=android-share

174
README.md

@@ -1,174 +0,0 @@
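# What is Ip2region?
ip2region is an offline IP address geolocation library with 99.9% accuracy and query times at the 0.0x millisecond level. The ip2region.db database is only a few MB. It provides query bindings for java, php, c, python, nodejs, golang and c#, and three query algorithms: binary, b-tree and memory.
# Ip2region features
### 99.9% accuracy rate
The data is aggregated from several well-known IP-to-location lookup providers, and the accuracy figures are the ones those providers officially state. In testing, it proved somewhat more accurate than the classic Chunzhen (CZ88) IP location database. <br />
The data of ip2region is aggregated from the open APIs or data of the following service providers (the update program issues 2 to 4 requests per second): <br />
01, &gt;80%, Taobao IP address database, [http://ip.taobao.com/](http://ip.taobao.com/) <br />
02, ≈10%, GeoIP, [https://geoip.com/](https://geoip.com/) <br />
03, ≈2%, Chunzhen IP database, [http://www.cz88.net/](http://www.cz88.net/) <br />
<b>Remarks:</b> If none of the providers above continue to offer open APIs or data, ip2region will stop updating its data.
### Standardized data format
Each IP data segment has a fixed format:
```
_cityId|country|region|province|city|ISP_
```
Only the data for China is accurate to the city level. Some data for other countries can only be located to the country, and in that case all the remaining fields are 0. The data already covers every country, large or small, that you can look up. (Please ignore the leading city Id; it exists for a personal project requirement.)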
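### Small size
The generated database file ip2region.db contains all IPs yet is only a few MB; the smallest version is only 1.5 MB. The size grows slowly as the data becomes more detailed, and currently does not exceed 8 MB.
### Fast query speed
A single query takes 0.x milliseconds in every query client. There are three built-in query algorithms:
1. memory algorithm: the entire database is loaded into memory; a single query takes within 0.1x milliseconds, and a single query with the C client is at the 0.00x millisecond level.
2. binary algorithm: binary search over the ip2region.db file, no need to load it into memory; a single query is at the 0.x millisecond level.
3. b-tree algorithm: a b-tree lookup over the ip2region.db file, no need to load it into memory; a single query is at the 0.x millisecond level, faster than the binary algorithm.
In every client the b-tree algorithm is faster than the binary algorithm, and the memory algorithm is of course the fastest.
### Multi-query client support
The clients already integrated are: java, C#, php, c, python, nodejs, php extensions (php5 and php7), golang, rust, lua, lua_c, nginx.
binding | description | development status | binary query time | b-tree query time | memory query time
:-: | :-: | :-: | :-: | :-: | :-:
[c](binding/c) | ANSI C binding | completed | 0.0x ms | 0.0x ms | 0.00x ms
[c#](binding/c#) | c# binding | completed | 0.x ms | 0.x ms | 0.1x ms
[golang](binding/golang) | golang binding | completed | 0.x ms | 0.x ms | 0.1x ms
[java](binding/java) | java binding | completed | 0.x ms | 0.x ms | 0.1x ms
[lua](binding/lua) | binding implemented in lua | completed | 0.x ms | 0.x ms | 0.x ms
[lua_c](binding/lua_c) | c extension for lua | completed | 0.0x ms | 0.0x ms | 0.00x ms
[nginx](binding/nginx) | c extension for nginx | completed | 0.0x ms | 0.0x ms | 0.00x ms
[nodejs](binding/nodejs) | nodejs | completed | 0.x ms | 0.x ms | 0.1x ms
[php](binding/php) | binding implemented in php | completed | 0.x ms | 0.1x ms | 0.1x ms
[php5_ext](binding/php5_ext) | c extension for php5 | completed | 0.0x ms | 0.0x ms | 0.00x ms
[php7_ext](binding/php7_ext) | c extension for php7 | completed | 0.0x ms | 0.0x ms | 0.00x ms
[python](binding/python) | python binding | completed | 0.x ms | 0.x ms | 0.x ms
[rust](binding/rust) | rust binding | completed | 0.x ms | 0.x ms | 0.x ms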
# ip2region quick test
Please refer to the README under each binding to run its CLI test program. For example, the C demo runs as follows:
```shell
cd binding/c/
gcc -g -O2 testSearcher.c ip2region.c
./a.out ../../data/ip2region.db
```
You will see the following CLI interface:
```shell
initializing B-tree ...
+----------------------------------+
| ip2region test script |
| Author: chenxin619315@gmail.com |
| Type 'quit' to exit program |
+----------------------------------+
p2region>> 101.105.35.57
2163|中国|华南|广东省|深圳市|鹏博士 in 0.02295 millseconds
```
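Enter an IP address to start testing; the first query will be slightly slow. Append binary or memory to the run command to try the other algorithms. The b-tree algorithm is recommended; if you have demanding speed and concurrency requirements, use the memory algorithm. For integration details, refer to the test source code under each binding.
# ip2region installation
Refer to the README and test demos under each binding for details. The following shortcut installation methods are available.
### maven repository address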
```xml
<dependency>
<groupId>org.lionsoul</groupId>
<artifactId>ip2region</artifactId>
<version>1.7.2</version>
</dependency>
```
### nodejs
```
npm install node-ip2region --save
```
### nuget install
```shell
Install-Package IP2Region
```
### php composer
```shell
# Plugin from: https://github.com/zoujingli/ip2region
composer require zoujingli/ip2region
```
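# ip2region concurrent use
1. The search interfaces of all bindings are <b>not</b> thread-safe; different threads can use them by each creating its own query object. Under high concurrency, the binary and b-tree algorithms may fail with a "too many open files" error. In that case, raise the kernel's maximum number of open files (fs.file-max=a higher value) or use a persistent memory algorithm.
2. The memorySearch interface can be used safely in a multi-threaded environment, provided one pre-query is performed before the object is published (this essentially loads the ip2region.db file into memory).
# ip2region.db generation
Starting from version 1.8, ip2region has open-sourced the java implementation of the ip2region.db generator, with ant build support. Building it produces the dbMaker-{version}.jar mentioned below. If you need to study the generator or customize the generation configuration, refer to the java source code in ${ip2region_root}/maker/java.
Starting with ip2region version 1.2.2, an executable jar file, dbMaker-{version}.jar, is committed in the repository; use it to generate the database as follows:
1. Make sure a java environment is installed (if you don't normally use Java, a quick search will show you how; for temporary use it only takes a few minutes).
2. cd to ${ip2region_root}/maker/java and run the following command: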
```shell
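java -jar dbMaker-{version}.jar -src <text data file> -region <region csv file> [-dst <directory for the generated ip2region.db file>]
# text data file: path to the original text data from which the db file is built; the bundled ip2region.db was generated from /data/ip.merge.txt, which you can replace with your own file or edit and then regenerate
# region csv file: lets ip2region store the data relationships so that the resulting data contains a city_id; you can use /data/origin/global_region.csv directly
# directory for the ip2region.db file: optional; if not specified, a ./data/ip2region.db file is generated under the current directory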
```
3. Overwrite the original ip2region.db file with the generated one.
4. The default ip2region.db file generation command:
```shell
cd ${ip2region_root}/java/
java -jar dbMaker-1.2.2.jar -src ./data/ip.merge.txt -region ./data/global_region.csv
# You will see a large amount of output
```
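# Related remarks
### Declaration
ip2region focuses on <b>researching IP data storage design and query implementations in various languages</b>. It has no original IP data of its own; see the description above for the data sources. Updating the data requires many IPs and puts a certain amount of request pressure on the source platforms, so this project does not guarantee timely data updates. There is no commercial version and there will not be one. You can import your own data into ip2region to build a custom query implementation.
### Technical Communication
1. For the structure and principles of the database file, read @冬芽's blog post ["ip2region database file structure and principle"](https://github.com/dongyado/dongyado.github.io/blob/master/_posts/2016-08-18-structure-of-ip2region-database-file.md) and the video [ip2region data structure design and implementation video share](https://www.bilibili.com/video/BV1wv4y1N7SD).
2. For ip2region discussion and sharing, WeChat: lionsoul2014 (please mention ip2region), QQ: 1187582057 (rarely checked).
3. Videos on the detection-algorithm-based data update method: [data update implementation video share part1](https://www.bilibili.com/video/BV1934y1E7Q5/), [data update implementation video share part2](https://www.bilibili.com/video/BV1pF411j7Aw/)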