What is the purpose of XPort? What is "Annex F"?

By now we are all quite accustomed to a two-step process for e-filing a US patent application. First, we use PASAT to add XML tags to the patent application. Then we use ePave to send the XML file to the Patent Office.

If you try to do this after installing ePave 5.1, then when you run ePave and try to attach the XML file that you created using PASAT, you will invariably receive an error message:

<file name> is not a "application-body" xml file.

This message raises more questions than it answers. What is an "application-body" xml file? Why is it so important to be an "application-body" xml file? What, exactly, am I supposed to do about the error message?

The third question, at least, is answered in the ePave documentation if you read through to page 190. It turns out that you run a program called XPort, which takes your PASAT-generated xml file (which is not, apparently, an "application-body" file) and converts it into a file which will not yield the "application-body" error message and which will allow ePave to continue.

To understand what is going on here, it is necessary to provide a bit of background and history.

As you know, XML means "extensible markup language". "Extensible" means that anybody who wants to can make up any old tags to identify information within an XML file. The person who makes up some new tag is "extending" the markup language, and being able to do such extending is felt to be A Good Thing. In this way XML can be extended to provide a way to tag patent applications, trademark applications, mathematical formulas, chemical structure diagrams, or any of myriad other things.

Of course this benefits society only if the senders and receivers of XML files use the same tags to mean the same things. Professional and academic groups have gotten together to define, for example, the tags to be used throughout the mathematical profession to tag mathematical formulas and the tags to be used throughout the chemical profession to tag chemical-structure diagrams. Whenever anybody chooses some tag definitions, they memorialize them in a DTD (document type definition).

So when the US Patent Office decided several years ago to use XML for e-filing of patent applications, it developed a DTD of its own, called "u-specif.dtd". In this DTD, for example, the title is tagged with "<title-of-invention>" and the figures are tagged with "<figures>". PASAT uses the u-specif.dtd file when it adds XML tags to a patent application. Most users of ePave 4.1 and PASAT may never have known about or paid any attention to the u-specif.dtd file, but every time you have e-filed a patent application using ePave 4.1, a copy of the u-specif.dtd file was placed in your submission folder, used by PASAT, and included among the files sent to the Patent Office by ePave.

The u-specif.dtd was designed specifically for US patent filings. WIPO has spent several years, however, trying to develop a DTD for patent applications that will (hopefully) work for patent applications all around the world. The DTD resulting from these efforts is called the "Annex F" DTD and you can read about it at http://www.wipo.int/pct-safe/en/index.htm . Where the u-specif DTD tags the title with a "title-of-invention" tag, the Annex F DTD tags it with "invention-title". Where the u-specif DTD tags the figures with a "figures" tag, the Annex F DTD tags them with "drawings". And so on through all of the other ways that two different groups of people might choose tags to tag various parts of a patent application.

Whenever you create an XML file, you are supposed to store information within the file to identify the DTD that was used. PASAT does this for you, so that you need not think about it, and what it stores is information indicating that "u-specif.dtd" was used to create the XML file.
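As a rough illustration of what "information to identify the DTD" looks like, here is a minimal Python sketch that reads the DOCTYPE declaration at the top of an XML file. The sample document, its root element name "specification", and the function name declared_dtd are assumptions made for illustration; the exact declaration that PASAT writes is not reproduced here.

```python
import re

# Hypothetical sample; the real PASAT output may declare its DTD differently.
SAMPLE = '''<?xml version="1.0"?>
<!DOCTYPE specification SYSTEM "u-specif.dtd">
<specification>
  <title-of-invention>Widget</title-of-invention>
</specification>
'''

# Matches the root element name and, if present, the DTD file named after
# SYSTEM (or the second quoted string after PUBLIC).
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[\w.:-]+)'
    r'(?:\s+SYSTEM\s+"(?P<sys>[^"]*)"'
    r'|\s+PUBLIC\s+"[^"]*"\s+"(?P<pubsys>[^"]*)")?'
)


def declared_dtd(xml_text):
    """Return (root_name, dtd_file) from the DOCTYPE, or None if there is none."""
    match = DOCTYPE_RE.search(xml_text)
    if match is None:
        return None
    return match.group("root"), match.group("sys") or match.group("pubsys")


print(declared_dtd(SAMPLE))
```

A program that wants to know which set of tags a file uses can look at this declaration without reading the rest of the file.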

It turns out that if you are using ePave 5.1 and if you try to attach a patent specification, ePave 5.1 checks to see which DTD was used. If it finds that the DTD that was used is "u-specif", then it gives you the error message that we talked about before:

<file name> is not a "application-body" xml file.

What ePave 5.1 is trying to tell you, though perhaps not as clearly as one would hope, is that it wishes you had used the Annex F DTD when you created your XML patent application. The way you would know this is if you already knew that "application-body" means "Annex F." It turns out that the name for the Annex F DTD is "application-body.dtd".

The ePave 5.1 user manual talks about this. It says:

When attempting to attach a specification produced by PASAT, ePave will return an error message indicating that the specification is not in accordance with the international filing standard.

By "international filing standard", the author of the ePave 5.1 manual means "Annex F", which in turn is "application-body.dtd".

You will recall that when you installed ePave 5.1 you also were asked to install a program called XPort. The situation I have described (two different DTDs created by two different groups of people) turns out to be the explanation for why you were asked to install XPort. When you run XPort, it converts your XML file from u-specif to application-body. For example, when it finds "title-of-invention" in the XML file that you created with PASAT, it replaces it with "invention-title". When it finds "figures" it replaces it with "drawings". And so on until your XML file uses the "application-body" tags instead of the u-specif tags. It also deletes the place where the XML file said that the u-specif DTD had been used, and inserts instead a statement that the application-body DTD was used. Finally, the file is stored with a new file name that has "-trans" appended to the file name. If the file created using PASAT was called "spec.xml" then the output from XPort will be called "spec-trans.xml".
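The three steps just described (rename the tags, swap the DTD declaration, store the result under a "-trans" name) can be sketched in a few lines of Python. This is only an illustration of the idea, not XPort's actual implementation: the tag table holds just the two renames mentioned above, the sample document and its root element are hypothetical, and the function names convert and trans_name are made up for this sketch.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

# Only the two renames named in the text; the real mapping is surely longer.
TAG_MAP = {
    "title-of-invention": "invention-title",
    "figures": "drawings",
}

# Hypothetical PASAT-style input; the root element name is an assumption.
SAMPLE = '''<!DOCTYPE specification SYSTEM "u-specif.dtd">
<specification>
<title-of-invention>Widget</title-of-invention>
<figures/>
</specification>'''


def trans_name(filename):
    """Append "-trans" before the extension: spec.xml becomes spec-trans.xml."""
    p = PurePath(filename)
    return str(p.with_name(f"{p.stem}-trans{p.suffix}"))


def convert(xml_text, new_dtd="application-body.dtd"):
    """Rename known tags and declare the new DTD in place of the old one."""
    root = ET.fromstring(xml_text)  # the parser discards the old DOCTYPE
    for elem in root.iter():
        elem.tag = TAG_MAP.get(elem.tag, elem.tag)
    body = ET.tostring(root, encoding="unicode")
    # Assumption: the root element name is reused as-is in the new DOCTYPE.
    return f'<!DOCTYPE {root.tag} SYSTEM "{new_dtd}">\n{body}'


print(trans_name("spec.xml"))
print(convert(SAMPLE))
```

A real converter would also have to deal with elements whose structure differs between the two DTDs, not merely their names, which a simple lookup table like this cannot handle.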

You can run the XPort program from within ePave by clicking a button called "Transform XML Document". A new window will appear which permits you to browse until you find the XML file that came from PASAT. XPort will then ask what type of transformation you want to perform. The correct choice is "specification to US-application-body". When the transformation is done, XPort will say "all paragraphs have been renumbered sequentially". I am not sure why this is interesting, given that PASAT already numbered all the paragraphs sequentially. For example, I took a PASAT spec with 74 paragraphs (numbered 1 to 74) and used XPort to make it into an application-body file, and it still had 74 paragraphs numbered 1 to 74.

Recall that when we encountered the error message it was because we were trying to attach an XML specification. After we have run XPort we will have created a new XML specification (ending in "-trans") and we can ask ePave to attach that XML specification. ePave will look inside the XML file to see what DTD is mentioned and it will see "application-body" and it will accept the attachment.

When the ePave 5.1 documentation discusses this problem (the need to use XPort because of the change of DTDs) it repeatedly talks about "a specification produced by PASAT," a phrase that seems a bit awkward since I think every specification anybody ever produced for US e-filing was produced by PASAT. (Yes, there is the WordPerfect authoring tool, but my impression is that nobody has ever actually used it.) So why does the documentation author use this phrase?

And a related question. When the Patent Office went to the trouble to develop and release a new ePave version that is tied to the new Annex F DTD, why did it not drop the other shoe by developing and releasing a new PASAT version that would use the new Annex F DTD (and, ideally, that would remedy various bugs that have been identified by users)?

I believe the answer is that the Patent Office is hoping that the five new partners ( http://www.uspto.gov/web/offices/com/speeches/02-48.htm ) will develop software that could replace PASAT/XPort. Such new software would of course follow the Annex F DTD and would hopefully generate not only the patent application XML file but perhaps also XML files for the other things that ePave does -- the application data sheet, the fee transmittal, the IDS, the assignment recordation cover sheet, and so on. Such software might be able to draw upon, for example, an "address book" of inventors to save tedious retyping of inventor information, etc.