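Apache HTTPD and NGINX Access Log Parser

宝哥大数据, published 2021-03-07 11:29:58

This is a log-parsing framework intended to simplify parsing Apache HTTPD and NGINX access log files.

The basic idea is that you should be able to construct a parser simply by telling it which configuration options were used to write the log line. Those configuration options are, in effect, the schema of the access log line.

GitHub: https://github.com/nielsbasjes/logparser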

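Installing the Lombok plugin in IDEA

Lombok uses simple annotations to cut down on boilerplate Java code and improve developer productivity. A typical JavaBean, for example, needs a getter and setter for every field, and often constructors, equals and similar methods as well, all of which have to be written and maintained by hand. With many fields this produces a large number of getter/setter methods that are verbose and carry little technical value, and it is easy to change a field and forget to update the corresponding methods.

Through annotations, Lombok generates constructors, getters/setters, equals, hashCode and toString at compile time. The source code contains no getter or setter methods, yet the compiled bytecode does. This removes the need to write and rewrite that code by hand and keeps the source more concise.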
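
To make the effect concrete, the sketch below shows roughly what would have to be written by hand for a single field that `@Getter @Setter` covers. The class name `PlainRecord` is an assumption used only for illustration; it does not appear in the project.

```java
// Hand-written equivalent of:  @Getter @Setter private String method = null;
// PlainRecord is a hypothetical class used only for illustration.
class PlainRecord {
    private String method = null;

    // What @Getter would generate for the field
    public String getMethod() {
        return method;
    }

    // What @Setter would generate for the field
    public void setMethod(String method) {
        this.method = method;
    }

    public static void main(String[] args) {
        PlainRecord record = new PlainRecord();
        record.setMethod("GET");
        System.out.println(record.getMethod()); // prints GET
    }
}
```

With Lombok, the two methods disappear from the source and only the annotated field remains.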

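Installation steps:

  • Open IDEA and go to the plugin installation window
  • Choose to install the plugin from disk
  • Select the path to the plugin file
  • Restart IDEA for the plugin to take effect

Import the dependency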

    <dependency>
        <groupId>nl.basjes.parse.httpdlog</groupId>
        <artifactId>httpdlog-parser</artifactId>
        <version>5.2</version>
    </dependency>

Sample nginx log

The log output format can be configured in the nginx.conf file under nginx's conf directory, for example:

 #log_format  main   '$remote_addr - $remote_user [$time_local] [$msec] '
                     '[$request_time] [$http_host] "$request" '
                     '$status $body_bytes_sent "$request_body" "$http_referer" '
                     '"$http_user_agent" $http_x_forwarded_for';

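$remote_addr: the client address.

$remote_user: the user name supplied by the client for authentication; empty (logged as "-") when the authentication module is not enabled.

$time_local: the local time of the nginx server, in Common Log Format.

$msec: the current time as a Unix timestamp in seconds, with millisecond resolution.

$request_time: the time from the start of the request until the response has been sent, in seconds with millisecond resolution.

$http_host: the requested domain name (the Host request header).

$request: the full request line, i.e. method, URL and HTTP protocol.

$status: the response status code, for example 200 for success.

$body_bytes_sent: the size of the body data sent from the server to the client, not counting the response headers.

$request_body: the request body, for example the parameters submitted with a POST request.

$http_referer: the page from which the request was linked.

$http_user_agent: information about the client's browser.

$http_x_forwarded_for: the X-Forwarded-For request header, i.e. the addresses the request was forwarded from.

$upstream_response_time: the time from nginx establishing the connection to the upstream server until it has received the data and closed the connection. This variable is not used in the format shown above.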

Example:

125.88.xxx.xx - - [02/Nov/2018:14:28:49 +0800] [1541140129.431] [0.095] [ma.xx.game.com] "POST /?log3/gameReport HTTP/1.1" 200 358 "type=1&id=NaN" "http://ma.xx.game.com/?log3/gameReport2" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36" -
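
To show how the variables line up with the sample entry, here is a minimal sketch that splits the line with a hand-written regular expression and converts the two time fields. It uses only the JDK and is not how logparser works internally; the class `NginxLineSketch`, its method names and the regex group names are assumptions made for this illustration.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of logparser.
class NginxLineSketch {
    // Group names mirror the nginx variables (Java group names cannot contain '_').
    static final String[] FIELDS = {
        "remoteAddr", "remoteUser", "timeLocal", "msec", "requestTime", "httpHost",
        "request", "status", "bodyBytesSent", "requestBody", "httpReferer",
        "httpUserAgent", "httpXForwardedFor"
    };

    // One capture group per variable, in the order used by the log_format above.
    static final Pattern LINE = Pattern.compile(
        "^(?<remoteAddr>\\S+) - (?<remoteUser>\\S+) \\[(?<timeLocal>[^\\]]+)\\] \\[(?<msec>[^\\]]+)\\] "
        + "\\[(?<requestTime>[^\\]]+)\\] \\[(?<httpHost>[^\\]]+)\\] \"(?<request>[^\"]*)\" "
        + "(?<status>\\d{3}) (?<bodyBytesSent>\\d+) \"(?<requestBody>[^\"]*)\" "
        + "\"(?<httpReferer>[^\"]*)\" \"(?<httpUserAgent>[^\"]*)\" (?<httpXForwardedFor>.*)$");

    static final String SAMPLE =
        "125.88.xxx.xx - - [02/Nov/2018:14:28:49 +0800] [1541140129.431] [0.095] [ma.xx.game.com] "
        + "\"POST /?log3/gameReport HTTP/1.1\" 200 358 \"type=1&id=NaN\" "
        + "\"http://ma.xx.game.com/?log3/gameReport2\" "
        + "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
        + "Chrome/70.0.3538.77 Safari/537.36\" -";

    // Splits one log line into a map keyed by field name, in log order.
    static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Line does not match the expected format");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String name : FIELDS) {
            fields.put(name, m.group(name));
        }
        return fields;
    }

    // $time_local, e.g. "02/Nov/2018:14:28:49 +0800"
    static Instant timeLocalToInstant(String timeLocal) {
        DateTimeFormatter format =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);
        return OffsetDateTime.parse(timeLocal, format).toInstant();
    }

    // $msec, e.g. "1541140129.431" (seconds with a millisecond fraction)
    static Instant msecToInstant(String msec) {
        return Instant.ofEpochMilli(new BigDecimal(msec).movePointRight(3).longValueExact());
    }

    public static void main(String[] args) {
        Map<String, String> fields = parse(SAMPLE);
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println(timeLocalToInstant(fields.get("timeLocal"))); // 2018-11-02T06:28:49Z
        System.out.println(msecToInstant(fields.get("msec")));           // 2018-11-02T06:28:49.431Z
    }
}
```

Both time fields describe the same moment: `$time_local` carries the zone offset, while `$msec` adds millisecond precision. A hand-written regex like this has to be rewritten whenever the log_format changes, which is the maintenance burden the framework is meant to remove.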

Clickstream log sample

2001-980:91c0:1:8d31:a232:25e5:85d 222.68.172.190 - [05/Sep/2010:11:27:50 +0200] "GET /images/my.jpg HTTP/1.1" 404 23617 "http://www.angularjs.cn/A00n" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; nl-nl) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8" "jquery-ui-theme=Eggplant; BuI=SomeThing; Apache=127.0.0.1.1351111543699529" "beijingshi"

Define the format string

Reference: https://httpd.apache.org/docs/current/mod/mod_log_config.html

%u %h %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" \"%{Addr}i\"
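
Because the format string acts as the schema, it can help to list the directives it contains and compare them with the fields of a sample line. The sketch below does this with a simplified regular expression; it ignores less common syntax such as status-code conditions and the literal `%%`. The class `LogFormatTokens` is a hypothetical name and is unrelated to the library's own format parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of logparser.
class LogFormatTokens {
    // '%', an optional '<' or '>', an optional {argument}, then the directive letter
    static final Pattern DIRECTIVE = Pattern.compile("%[<>]?(?:\\{([^}]*)\\})?([A-Za-z])");

    // Returns the directives in the order in which they appear in the format string.
    static List<String> directives(String logformat) {
        List<String> found = new ArrayList<>();
        Matcher m = DIRECTIVE.matcher(logformat);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String logformat = "%u %h %l %t \"%r\" %>s %b \"%{Referer}i\" "
            + "\"%{User-Agent}i\" \"%{Cookie}i\" \"%{Addr}i\"";
        // Prints one directive per line: %u, %h, %l, %t, %r, %>s, %b, then the four header directives
        directives(logformat).forEach(System.out::println);
    }
}
```

For the format above this yields eleven directives, matching the eleven fields of the clickstream sample line.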

Usage

Print every field available for the log format string

import java.util.List;

import nl.basjes.parse.core.Parser;
import nl.basjes.parse.core.exceptions.DissectionFailure;
import nl.basjes.parse.core.exceptions.InvalidDissectorException;
import nl.basjes.parse.core.exceptions.MissingDissectorsException;
import nl.basjes.parse.httpdlog.HttpdLoglineParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Test {
    private static final Logger LOG = LoggerFactory.getLogger(Test.class);

    String logformat = "%u %h %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" \"%{Addr}i\"";
    String logline = "2001-980:91c0:1:8d31:a232:25e5:85d 222.68.172.190 - [05/Sep/2010:11:27:50 +0200] \"GET /images/my.jpg HTTP/1.1\" 404 23617 \"http://www.angularjs.cn/A00n\" \"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; nl-nl) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8\" \"jquery-ui-theme=Eggplant; BuI=SomeThing; Apache=127.0.0.1.1351111543699529\" \"beijingshi\"";

    public static void main(String[] args) throws MissingDissectorsException, NoSuchMethodException, DissectionFailure, InvalidDissectorException {
        new Test().run();
    }

    private void run() throws InvalidDissectorException, MissingDissectorsException, NoSuchMethodException, DissectionFailure {
        // Print the list of fields that can be extracted for this log format
        printAllPossibles(logformat);
    }

    // Print every possible field path together with the casts it supports
    private void printAllPossibles(String logformat) throws NoSuchMethodException, MissingDissectorsException, InvalidDissectorException {
        // Object.class is used as the record type because nothing is actually parsed here
        Parser<Object> dummyParser = new HttpdLoglineParser<>(Object.class, logformat);
        List<String> possiblePaths = dummyParser.getPossiblePaths();
        // Register a dummy setter for all paths (as in the project's own examples) so the casts can be queried
        dummyParser.addParseTarget(String.class.getMethod("indexOf", String.class), possiblePaths);

        LOG.info("==================================");
        System.out.println("==================================");
        LOG.info("Possible output:");
        System.out.println("Possible output:");
        for (String path : possiblePaths) {
            System.out.println(path + "     " + dummyParser.getCasts(path));
            LOG.info("{}     {}", path, dummyParser.getCasts(path));
        }
        System.out.println("==================================");
        LOG.info("==================================");
    }
}
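
Getting started

Define the entity class

import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

import lombok.Getter;
import lombok.Setter;
import nl.basjes.parse.core.Field;

public class MyRecord {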
    @Getter @Setter private String connectionClientUser = null;
    @Getter @Setter private String connectionClientHost = null;
    @Getter @Setter private String requestReceiveTime   = null;
    @Getter @Setter private String method               = null;
    @Getter @Setter private String referrer             = null;
    @Getter @Setter private String screenResolution     = null;
    @Getter @Setter private String requestStatus        = null;
    @Getter @Setter private String responseBodyBytes    = null;
    @Getter @Setter private Long   screenWidth          = null;
    @Getter @Setter private Long   screenHeight         = null;
    @Getter @Setter private String googleQuery          = null;
    @Getter @Setter private String bui                  = null;
    @Getter @Setter private String useragent            = null;

    @Getter @Setter private String asnNumber            = null;
    @Getter @Setter private String asnOrganization      = null;
    @Getter @Setter private String ispName              = null;
    @Getter @Setter private String ispOrganization      = null;

    @Getter @Setter private String continentName        = null;
    @Getter @Setter private String continentCode        = null;
    @Getter @Setter private String countryName          = null;
    @Getter @Setter private String countryIso           = null;
    @Getter @Setter private String subdivisionName      = null;
    @Getter @Setter private String subdivisionIso       = null;
    @Getter @Setter private String cityName             = null;
    @Getter @Setter private String postalCode           = null;
    @Getter @Setter private Double locationLatitude     = null;
    @Getter @Setter private Double locationLongitude    = null;

    private final Map<String, String> results = new HashMap<>(32);

    @Field("STRING:request.firstline.uri.query.*")
    public void setQueryDeepMany(final String name, final String value) {
        results.put(name, value);
    }

    @Field("STRING:request.firstline.uri.query.img")
    public void setQueryImg(final String name, final String value) {
        results.put(name, value);
    }

    @Field("IP:connection.client.host")
    public void setIP(final String value) {
        results.put("IP:connection.client.host", value);
    }

    public String getUser() {
        return results.get("IP:connection.client.host");
    }


    @Field({
            "STRING:connection.client.user",
            "HTTP.HEADER:request.header.addr",
            "IP:connection.client.host.last",
            "TIME.STAMP:request.receive.time.last",
            "TIME.DAY:request.receive.time.last.day",
            "TIME.MONTHNAME:request.receive.time.last.monthname",
            "TIME.MONTH:request.receive.time.last.month",
            "TIME.WEEK:request.receive.time.last.weekofweekyear",
            "TIME.YEAR:request.receive.time.last.weekyear",
            "TIME.YEAR:request.receive.time.last.year",
            "TIME.HOUR:request.receive.time.last.hour",
            "TIME.MINUTE:request.receive.time.last.minute",
            "TIME.SECOND:request.receive.time.last.second",
            "TIME.MILLISECOND:request.receive.time.last.millisecond",
            "TIME.MICROSECOND:request.receive.time.last.microsecond",
            "TIME.NANOSECOND:request.receive.time.last.nanosecond",
            "TIME.DATE:request.receive.time.last.date",
            "TIME.TIME:request.receive.time.last.time",
            "TIME.ZONE:request.receive.time.last.timezone",
            "TIME.EPOCH:request.receive.time.last.epoch",
            "TIME.DAY:request.receive.time.last.day_utc",
            "TIME.MONTHNAME:request.receive.time.last.monthname_utc",
            "TIME.MONTH:request.receive.time.last.month_utc",
            "TIME.WEEK:request.receive.time.last.weekofweekyear_utc",
            "TIME.YEAR:request.receive.time.last.weekyear_utc",
            "TIME.YEAR:request.receive.time.last.year_utc",
            "TIME.HOUR:request.receive.time.last.hour_utc",
            "TIME.MINUTE:request.receive.time.last.minute_utc",
            "TIME.SECOND:request.receive.time.last.second_utc",
            "TIME.MILLISECOND:request.receive.time.last.millisecond_utc",
            "TIME.MICROSECOND:request.receive.time.last.microsecond_utc",
            "TIME.NANOSECOND:request.receive.time.last.nanosecond_utc",
            "TIME.DATE:request.receive.time.last.date_utc",
            "TIME.TIME:request.receive.time.last.time_utc",
            "HTTP.URI:request.referer",
            "HTTP.PROTOCOL:request.referer.protocol",
            "HTTP.USERINFO:request.referer.userinfo",
            "HTTP.HOST:request.referer.host",
            "HTTP.PORT:request.referer.port",
            "HTTP.PATH:request.referer.path",
            "HTTP.QUERYSTRING:request.referer.query",
            "STRING:request.referer.query.*",
            "HTTP.REF:request.referer.ref",
            "TIME.STAMP:request.receive.time",
            "TIME.DAY:request.receive.time.day",
            "TIME.MONTHNAME:request.receive.time.monthname",
            "TIME.MONTH:request.receive.time.month",
            "TIME.WEEK:request.receive.time.weekofweekyear",
            "TIME.YEAR:request.receive.time.weekyear",
            "TIME.YEAR:request.receive.time.year",
            "TIME.HOUR:request.receive.time.hour",
            "TIME.MINUTE:request.receive.time.minute",
            "TIME.SECOND:request.receive.time.second",
            "TIME.MILLISECOND:request.receive.time.millisecond",
            "TIME.MICROSECOND:request.receive.time.microsecond",
            "TIME.NANOSECOND:request.receive.time.nanosecond",
            "TIME.DATE:request.receive.time.date",
            "TIME.TIME:request.receive.time.time",
            "TIME.ZONE:request.receive.time.timezone",
            "TIME.EPOCH:request.receive.time.epoch",
            "TIME.DAY:request.receive.time.day_utc",
            "TIME.MONTHNAME:request.receive.time.monthname_utc",
            "TIME.MONTH:request.receive.time.month_utc",
            "TIME.WEEK:request.receive.time.weekofweekyear_utc",
            "TIME.YEAR:request.receive.time.weekyear_utc",
            "TIME.YEAR:request.receive.time.year_utc",
            "TIME.HOUR:request.receive.time.hour_utc",
            "TIME.MINUTE:request.receive.time.minute_utc",
            "TIME.SECOND:request.receive.time.second_utc",
            "TIME.MILLISECOND:request.receive.time.millisecond_utc",
            "TIME.MICROSECOND:request.receive.time.microsecond_utc",
            "TIME.NANOSECOND:request.receive.time.nanosecond_utc",
            "TIME.DATE:request.receive.time.date_utc",
            "TIME.TIME:request.receive.time.time_utc",
            "HTTP.URI:request.referer.last",
            "HTTP.PROTOCOL:request.referer.last.protocol",
            "HTTP.USERINFO:request.referer.last.userinfo",
            "HTTP.HOST:request.referer.last.host",
            "HTTP.PORT:request.referer.last.port",
            "HTTP.PATH:request.referer.last.path",
            "HTTP.QUERYSTRING:request.referer.last.query",
            "STRING:request.referer.last.query.*",
            "HTTP.REF:request.referer.last.ref",
            "NUMBER:connection.client.logname",
            "BYTESCLF:response.body.bytes",
            "BYTES:response.body.bytes",
            "HTTP.USERAGENT:request.user-agent.last",
            "HTTP.COOKIES:request.cookies.last",
            "HTTP.COOKIE:request.cookies.last.*",
            "STRING:request.status.last",
            "HTTP.USERAGENT:request.user-agent",
            "STRING:connection.client.user.last",
            "HTTP.FIRSTLINE:request.firstline.original",
            "HTTP.METHOD:request.firstline.original.method",
            "HTTP.URI:request.firstline.original.uri",
            "HTTP.PROTOCOL:request.firstline.original.uri.protocol",
            "HTTP.USERINFO:request.firstline.original.uri.userinfo",
            "HTTP.HOST:request.firstline.original.uri.host",
            "HTTP.PORT:request.firstline.original.uri.port",
            "HTTP.PATH:request.firstline.original.uri.path",
            "HTTP.QUERYSTRING:request.firstline.original.uri.query",
            "STRING:request.firstline.original.uri.query.*",
            "HTTP.REF:request.firstline.original.uri.ref",
            "HTTP.PROTOCOL_VERSION:request.firstline.original.protocol",
            "HTTP.PROTOCOL:request.firstline.original.protocol",
            "HTTP.PROTOCOL.VERSION:request.firstline.original.protocol.version",
            "BYTESCLF:response.body.bytes.last",
            "BYTES:response.body.bytes.last",
            "NUMBER:connection.client.logname.last",
            "HTTP.FIRSTLINE:request.firstline",
            "HTTP.METHOD:request.firstline.method",
            "HTTP.URI:request.firstline.uri",
            "HTTP.PROTOCOL:request.firstline.uri.protocol",
            "HTTP.USERINFO:request.firstline.uri.userinfo",
            "HTTP.HOST:request.firstline.uri.host",
            "HTTP.PORT:request.firstline.uri.port",
            "HTTP.PATH:request.firstline.uri.path",
            "HTTP.QUERYSTRING:request.firstline.uri.query",
            "STRING:request.firstline.uri.query.*",
            "HTTP.REF:request.firstline.uri.ref",
            "HTTP.PROTOCOL_VERSION:request.firstline.protocol",
            "HTTP.PROTOCOL:request.firstline.protocol",
            "HTTP.PROTOCOL.VERSION:request.firstline.protocol.version",
            "HTTP.COOKIES:request.cookies",
            "HTTP.COOKIE:request.cookies.*",
            "BYTES:response.body.bytesclf",
            "BYTESCLF:response.body.bytesclf",
            "IP:connection.client.host",
    })
    public void setValue(final String name, final String value) {
        results.put(name, value);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // Sort the keys so the output order is stable
        TreeSet<String> keys = new TreeSet<>(results.keySet());
        for (String key : keys) {
            sb.append(key).append(" = ").append(results.get(key)).append('\n');
        }

        return sb.toString();
    }

    public void clear() {
        results.clear();
    }
}
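
The record class relies on setters being called with a field path and a value whenever the parser produces that field. The sketch below imitates this dispatch with plain reflection so the mechanism can be seen in isolation. It is not the library's implementation: the annotation `DemoField`, the class `DemoRecord` and the method `deliver` are all hypothetical, and wildcard paths such as `query.*` are not handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of annotation-driven setter dispatch; not logparser code.
class FieldDispatchSketch {
    // Stand-in for the library's @Field annotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DemoField {
        String[] value();
    }

    public static class DemoRecord {
        final Map<String, String> results = new HashMap<>();

        @DemoField({"IP:connection.client.host", "STRING:request.status.last"})
        public void setValue(String name, String value) {
            results.put(name, value);
        }
    }

    // Passes (path, value) to every public setter annotated with exactly that path.
    static void deliver(Object target, String path, String value) {
        for (Method method : target.getClass().getMethods()) {
            DemoField field = method.getAnnotation(DemoField.class);
            if (field == null) {
                continue; // not an annotated setter
            }
            for (String wanted : field.value()) {
                if (wanted.equals(path)) {
                    try {
                        method.invoke(target, path, value);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Could not call " + method.getName(), e);
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        DemoRecord record = new DemoRecord();
        deliver(record, "IP:connection.client.host", "222.68.172.190");
        deliver(record, "HTTP.METHOD:request.firstline.method", "GET"); // no setter asked for this path
        System.out.println(record.results); // {IP:connection.client.host=222.68.172.190}
    }
}
```

Only paths that a setter has asked for reach the record, which is why `MyRecord` lists every path it wants to receive.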

Core implementation:

import java.util.List;

import nl.basjes.parse.core.Parser;
import nl.basjes.parse.core.exceptions.DissectionFailure;
import nl.basjes.parse.core.exceptions.InvalidDissectorException;
import nl.basjes.parse.core.exceptions.MissingDissectorsException;
import nl.basjes.parse.httpdlog.HttpdLoglineParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Test {
    private static final Logger LOG = LoggerFactory.getLogger(Test.class);

    String logformat = "%u %h %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" \"%{Addr}i\"";
    String logline = "2001-980:91c0:1:8d31:a232:25e5:85d 222.68.172.190 - [05/Sep/2010:11:27:50 +0200] \"GET /images/my.jpg HTTP/1.1\" 404 23617 \"http://www.angularjs.cn/A00n\" \"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; nl-nl) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8\" \"jquery-ui-theme=Eggplant; BuI=SomeThing; Apache=127.0.0.1.1351111543699529\" \"beijingshi\"";

    public static void main(String[] args) throws MissingDissectorsException, NoSuchMethodException, DissectionFailure, InvalidDissectorException {
        new Test().run();
    }

    private void run() throws InvalidDissectorException, MissingDissectorsException, NoSuchMethodException, DissectionFailure {
        // Print the list of fields that can be extracted for this log format
        printAllPossibles(logformat);

        // Map the log fields onto a MyRecord object
        Parser<MyRecord> parser = new HttpdLoglineParser<>(MyRecord.class, logformat);

        // Bind each setter (generated by Lombok) to the field path it should receive
        parser.addParseTarget("setConnectionClientUser", "STRING:connection.client.user");
        parser.addParseTarget("setConnectionClientHost", "IP:connection.client.host");
        parser.addParseTarget("setRequestReceiveTime", "TIME.STAMP:request.receive.time");
        parser.addParseTarget("setMethod", "HTTP.METHOD:request.firstline.method");
        parser.addParseTarget("setRequestStatus", "STRING:request.status.last");
        parser.addParseTarget("setScreenResolution", "HTTP.URI:request.firstline.uri");
        parser.addParseTarget("setResponseBodyBytes", "BYTES:response.body.bytes");
        parser.addParseTarget("setReferrer", "HTTP.URI:request.referer");
        parser.addParseTarget("setUseragent", "HTTP.USERAGENT:request.user-agent");

        MyRecord record = new MyRecord();
        System.out.println("==============================================");
        // Parse one line; the parser calls the registered setters on the record
        parser.parse(record, logline);
        LOG.info(record.toString());
        System.out.println(record.toString());
        System.out.println("==============================================");
        System.out.println(record.getConnectionClientUser());
        System.out.println(record.getConnectionClientHost());
        System.out.println(record.getRequestReceiveTime());
        System.out.println(record.getMethod());
        System.out.println(record.getScreenResolution());
        System.out.println(record.getRequestStatus());
        System.out.println(record.getResponseBodyBytes());
        System.out.println(record.getReferrer());
        System.out.println(record.getUseragent());
    }

    // Print every possible field path together with the casts it supports
    private void printAllPossibles(String logformat) throws NoSuchMethodException, MissingDissectorsException, InvalidDissectorException {
        Parser<Object> dummyParser = new HttpdLoglineParser<>(Object.class, logformat);
        List<String> possiblePaths = dummyParser.getPossiblePaths();
        // Register a dummy setter for all paths (as in the project's own examples) so the casts can be queried
        dummyParser.addParseTarget(String.class.getMethod("indexOf", String.class), possiblePaths);
        System.out.println("==================================");
        System.out.println("Possible output:");
        for (String path : possiblePaths) {
            System.out.println(path + "     " + dummyParser.getCasts(path));
        }
        System.out.println("==================================");
    }
}