Headers Analysis Tool
The Headers Analysis Tool (headertodb2.pl) is part of the LSB Library Import Tools and can be used as an alternative way of extracting data for libtodb2 (instead of using readelf and debug information).
The development version of the tool can be obtained from LSB Bazaar, in the headertodb2 subfolder.
In many cases, using headertodb2.pl is straightforward; see the examples below. However, the tool has some options that make the data collection process more adjustable.
USAGE
    ./headertodb2.pl [option...] [header...]

OPTIONS
    -a, --all
        Get all kinds of data (equivalent to specifying -m, -t, and -i).
    --debug
        Print some debugging information.
    --dir <dir_name>
        Look for requested headers in the <dir_name> directory. This option has no effect if the --header-dir option is specified.
    --dump-tree
        Dump a text representation of the trees obtained from the gcc parser for the headers being processed.
    --cpp-cl <options>
        String with additional options that will be passed to the cpp parser 'as is'.
    --cpp-keywords
        Replace C++ keywords with temporary names for correct parsing of C, and then replace them back.
    --first-define
        Keep the first define. By default it is considered a header re-inclusion guard and is not parsed.
    --func-list <file>
        File with the functions exported by the library (the list of such functions should be obtained at the first stage of the upload process, using the componenttodb.pl and dump_interfaces.pl scripts). Every line of the file should contain a function name and version, separated by a semicolon. If this option is not specified, information about all functions from the given headers is extracted. Global variables, if any, should also be present in the list.
    -I, --inc_dir <dir_name>
        Add <dir_name> to the list of directories searched for headers (<dir_name> is passed to gcc using its '-I' option).
    --intrinsic <file>
        Update intrinsics from <file>. Every line of the file should have the following format: 'name of intrinsic in g++ tree;name of intrinsic in LSB DB'. If this option is not specified, default intrinsics are used.
    -i, --get_info
        Get information from the g++ tree and put it into text files.
    -m, --get_macros
        Get macros from headers and put them into text files.
    -t, --get_tree
        Run the g++ parser on headers and init the tree.
    --gcc-cl <options>
        Extra command line options for g++ (the same as --cpp-cl).
    --gcc-path <dir>
        Path to the directory with a hacked or non-standard g++.
    -g, --group
        Include all headers into one header and process it.
    -H, --header <name>
        Add header <name> to the list of headers to be analyzed (in addition to headers specified using other options).
    -d, --header-dir <dir>
        Directory with target headers. All headers found in this directory will be processed. Note that subdirectories are also parsed.
    -l, --header-list <file>
        List of target headers.
    -h, --help
        Print this help and exit.
    --keep-temp
        Keep all temporary files (useful for debugging).
    --lib <name>
        Name of the processed library.
    --no-arch
        Don't analyze macros for different architectures.
    -o <name>
        Absolute path to the folder where all temporary and result files will be located. If this option is not specified, all output is written to the current working directory.
    --path <name>
        Prefix for all header names of the processed library. Note that if you use the -d option, subfolder names are automatically added to the prefix.
    -q, --quiet
        Don't print any informational messages.
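The --func-list description above fixes the file format as one 'name;version' entry per line. Before handing such a file to the tool, it can be convenient to check that format by hand. The following is only a sketch: check_func_list is a hypothetical helper, not part of headertodb2, and it assumes that an empty version field is acceptable (the option description does not say either way).

```shell
# check_func_list FILE
# Hypothetical helper: report lines that do not look like "name;version".
# Assumption: exactly one semicolon per line, a non-empty name, and a
# version that may be empty.
check_func_list() {
    awk -F';' '
        NF != 2 || $1 == "" {
            printf "line %d: expected name;version, got: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

The function returns a non-zero status if any line is malformed, so it can be used as a guard, e.g. check_func_list funcs.txt && ./headertodb2.pl --func-list funcs.txt ... (funcs.txt is a placeholder name).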
libsane header files are located in the /usr/include/sane folder. Let's suppose we need all headers from that folder (especially since there are only two of them).
The data collection process is very simple:
cd <path_to_headertodb2_directory>
./headertodb2.pl --lib libsane --path sane -a --header-dir /usr/include/sane/
./prepare_sql.pl -l libsane >add_libsane.sql
The second line is the most important here; in addition to the self-explanatory '--lib' and '--header-dir' options, it contains the '-a' option, which forces headertodb2.pl to extract all possible information (functions, variables, types, etc.; in general, the tool can be used to extract only part of this data). '--path' says that header names should carry the 'sane' prefix.
Now we may try to execute the SQL created:
mysql lsb <add_libsane.sql
(provide mysql with options such as user, dbhost, etc., if needed).
Note: You need to set up a MySQL database with the LSB data; see: Creating LSB database
Note: The SQL created is id-independent in the sense that it doesn't give fixed identifiers to the new entries. It uses the database to detect which entries should actually be added and which ones are already present.
If everything goes fine, we may proceed with the `set_appearedin.pl` script described on LSB Library Import Tools page:
./set_appearedin.pl -l libsane -a -f -v 4.0 >mark_appeared.sql
Note: Currently this script will mark all necessary entries in the database as included, and at the same time it will print to console all actions performed (they are collected to mark_appeared.sql in our example). These actions also don't use fixed identifiers.
Note also that when headertodb2.pl is used, set_appearedin.pl also processes all constants and macros, so there is no need to call process_defines.pl from libtodb2.
At this point we've uploaded all the data necessary for headers/tests generation, except for two high-level records: we should assign the new library to some submodule and add an ArchLib entry:
SET @Lid=(SELECT Lid FROM Library WHERE Lname='libsane');
SET @Mid=(SELECT SMid FROM SubModule WHERE SMname='<target_submodule_name>');
INSERT INTO SModLib VALUES( @Mid, @Lid, '4.0', NULL );
INSERT INTO ArchLib VALUES( @Lid, 1, 'libsane.so', '4.0', NULL );
We should also add IntStd records for the libsane interfaces, but that is a separate question. As far as data upload is concerned, we are done: if there were no problems (SQL errors), we may proceed with the generation of headers, stub libraries, tests, etc.
Note: As a result of our actions, we've obtained two SQL files: add_libsane.sql and mark_appeared.sql. These SQL files are id-independent and can be safely applied to any other copy of the database (though collisions are possible if we separately prepare SQL for libraries that add the same types or constants).
libcairo is a trickier example. Its headers are located in the /usr/include/cairo folder and depend on the freetype2 headers. headertodb2.pl uses gcc to obtain the data, and neither the cairo nor the freetype2 subfolder is in gcc's default include path. In addition, we don't actually need some of the headers, since they are too new: cairo-glitz.h, cairo-xcb.h and cairo-xcb-xrender.h.
The following line demonstrates the call to headertodb2.pl that should be performed to take into account these peculiarities:
./headertodb2.pl --lib libcairo --path cairo --header-dir /usr/include/cairo -a -I /usr/include/cairo -I /usr/include/freetype2 -N cairo-glitz.h -N cairo-xcb.h -N cairo-xcb-xrender.h
All other steps are the same as for libsane.
Let's also take a look at another useful example: adding some functions to a library that is already present in the database, libXext. Let's process its X11/extensions/extutil.h header. The problem with this header is that it uses some types declared in Xlib.h and Xlibint.h but doesn't include those headers itself. Xlibint.h includes Xlib.h, so Xlibint.h is enough. Surely, it is not nice to modify system headers; instead, headertodb2.pl has the '--cpp-cl' option, which allows passing options to the cpp preprocessor that will be taken into account when processing headers. In our case, we can call headertodb2.pl in the following way:
./headertodb2.pl --lib libXext --path X11/extensions -a --cpp-cl "-include /usr/include/X11/Xlibint.h" /usr/include/X11/extensions/extutil.h
When using this tool, one should remember that it is just a wrapper that calls the gcc (or cpp) parser, gets some dumps from it and translates them into a form suitable for the tools that perform the actual upload to the database. Thus, if you see gcc errors when using this tool, simply imagine that you are calling gcc to compile a program that just includes your headers, and then try to form the gcc options that would satisfy the compiler. The tool has the --cpp-cl option, whose value is passed directly to the cpp parser (there is a corresponding --gcc-cl option for gcc, but in general we recommend using the cpp parser, and it is used by default).
We can mention the following situations when extra options are required:
- Some necessary elements are declared under '#if' or '#ifdef' directives, and the appropriate conditions are not met during analysis. This can be corrected by defining (or undefining) the necessary constants using the -D (or -U) option.
- Some additional headers should be included before the target one, but you don't want to analyze those headers. In this case you can form the appropriate -include options for cpp and pass this string to headertodb2.pl using --cpp-cl.
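The advice to "imagine that you call gcc to compile a program which just includes your headers" can be tried literally before running the tool. The sketch below is an illustration only: try_header is a hypothetical helper, and using gcc with -fsyntax-only and -x c is an assumption about a convenient way to test an option string, not a description of what headertodb2.pl runs internally.

```shell
# try_header OPTIONS HEADER
# Hypothetical helper: feed a one-line translation unit that includes HEADER
# to the compiler, together with the options intended for --cpp-cl.
# Assumptions: the compiler is gcc unless CC is set; headers are parsed as C.
try_header() {
    opts=$1
    header=$2
    # $opts is deliberately unquoted so that it splits into separate options.
    # -fsyntax-only stops after parsing; "-x c -" reads C source from stdin.
    printf '#include "%s"\n' "$header" | ${CC:-gcc} $opts -fsyntax-only -x c -
}
```

For the libXext case above, one could run try_header "-include /usr/include/X11/Xlibint.h" /usr/include/X11/extensions/extutil.h and, once the compiler is satisfied, pass the same option string to headertodb2.pl via --cpp-cl.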
Note also that sometimes headers cannot be parsed one by one (default headertodb2 behaviour) due to complex or cross dependencies. In this case it can be useful to parse them 'all at once' - i.e. create a 'meta header' which will include all target headers (in the order given by user) and then parse such meta header with all included content. This behaviour can be turned on using '-g' option of headertodb2.pl.
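To make the 'meta header' idea concrete, the sketch below prints such a header for a list of targets, in the order given. It is only an illustration of the concept: make_meta_header is a hypothetical helper, and the exact form of the file that headertodb2.pl generates for '-g' is not documented here (the quoted #include form is an assumption).

```shell
# make_meta_header HEADER...
# Hypothetical helper: print a header that includes every argument,
# preserving the order in which the headers were given.
make_meta_header() {
    for h in "$@"; do
        printf '#include "%s"\n' "$h"
    done
}
```

For example, make_meta_header first.h second.h > meta.h (placeholder names) writes two #include lines, first.h before second.h, so declarations from the first header are visible while the second one is parsed.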
Building the specification requires entries in the Standard and IntStd tables. The former is a pointer either to an upstream specification or to LSB-maintained documentation for the library interfaces. For a new import to flow through the spec, one should try to create these entries. Here's an example for ncursesw, where some interfaces are documented upstream and others will require documentation to be created. There is no tool at this time to create this SQL, although a fairly simple shell script can accomplish the task of filtering out the links/interface and creating the subsequent URLs:
insert into Standard (Sname, Sfull, Surl, Stype, Sarch, Sshort, Sbaselink, Stag, Sappearedin)
    values ('Libncursesw', 'Libncursesw API', 'http://invisible-island.net/ncurses/man/ncurses.3x.html', 'Standard', 1, 'Libncursesw API', 'http://invisible-island.net/ncurses/man/', 'Libncursesw', '5.0');
set @ISsid=(last_insert_id());
insert into Standard (Sname, Sfull, Surl, Stype, Sarch, Sshort, Stag, Sappearedin)
    values ('ncursesw', 'Libncursesw Specification Placeholder', 'http://refspecs.linux-foundation.org/libncursesw/libncurses.html', 'Standard', 1, 'Libncursesw Placeholder', 'Libncursesw', '5.0');
set @ISsidp=(last_insert_id());
Alternatively, you can omit the last two statements of the SQL above (the second insert into Standard and the set @ISsidp that follows it) and just assign interfaces with no docs to ISsid=10 below, omitting ISrefspec.
SET @Iid=(SELECT Iid FROM Interface WHERE Iname='slk_attr_off' and Ilibrary='libncursesw');
INSERT INTO IntStd (ISiid, ISsid, ISappearedin, ISurl) VALUES (@Iid, @ISsid,'5.0','http://invisible-island.net/ncurses/man/curs_slk.3x.html');
SET @Iid=(SELECT Iid FROM Interface WHERE Iname='slk_attr_on' and Ilibrary='libncursesw');
INSERT INTO IntStd (ISiid, ISsid, ISappearedin, ISurl) VALUES (@Iid, @ISsid,'5.0','http://invisible-island.net/ncurses/man/curs_slk.3x.html');
SET @Iid=(SELECT Iid FROM Interface WHERE Iname='slk_attr' and Ilibrary='libncursesw');
Then for the interfaces with no upstream doc:
SET @Iid=(SELECT Iid FROM Interface WHERE Iname='add_wch' and Ilibrary='libncursesw');
INSERT INTO IntStd (ISiid, ISsid, ISrefspec, ISappearedin) VALUES (@Iid, 10, @ISsidp,'5.0');
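Since no tool exists for producing the IntStd SQL, a small shell function along the following lines may be enough for the interfaces that do have upstream documentation. This is a sketch under stated assumptions: gen_intstd and its 'interface;url' input format are hypothetical, the generated statements simply mirror the hand-written ones above, @ISsid is expected to have been set by the earlier insert into Standard, and names and URLs are assumed to contain no single quotes (nothing is escaped).

```shell
# gen_intstd LIBRARY APPEAREDIN
# Hypothetical helper: read "interface;url" lines from stdin and print the
# SET/INSERT pair for each one.
# Assumptions: @ISsid is already defined in the SQL session; no input field
# contains a single quote.
gen_intstd() {
    lib=$1
    ver=$2
    while IFS=';' read -r name url; do
        # Skip blank lines.
        [ -n "$name" ] || continue
        printf "SET @Iid=(SELECT Iid FROM Interface WHERE Iname='%s' AND Ilibrary='%s');\n" "$name" "$lib"
        printf "INSERT INTO IntStd (ISiid, ISsid, ISappearedin, ISurl) VALUES (@Iid, @ISsid, '%s', '%s');\n" "$ver" "$url"
    done
}
```

A call such as gen_intstd libncursesw 5.0 < links.txt > intstd.sql (file names are placeholders) produces one SET/INSERT pair per input line; the output should be reviewed before it is applied to the database.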
There should also be entries in SModStd, linking the new Standard entry to the appropriate SubModule, and the LSB version:
set @SMid=(select SMid from SubModule where SMname='LSB_Core');
INSERT INTO `SModStd` VALUES (@SMid,@ISsid,'5.0',NULL);
Note: Currently, when the -g option is used, the tool does not process or include any "defines" (macros/constants).