This looks nice. What I'd really like to see, along these lines, is a Python library for automated document metadata extraction with confidence assessment, like this:
I thought about the metadata thing but decided to leave it out of the earliest versions of textract to keep things simple. If you'd like to see it in there and have a good example of how you'd use metadata, please feel free to open an issue on the issue tracker: https://github.com/deanmalmgren/textract/issues/
As far as I have been able to tell, the public state of the art in academic paper metadata parsing is Grobid: https://github.com/kermitt2/grobid
It's not quite as simple a command-line interface as you suggest, but it's not too hard to set up, and it's pretty impressive. Now if only Google Scholar would open-source whatever they use...
./autometa.py --author --verbose academic-paper.pdf
Author: "Edward Witten" Confidence: High (matches template "amslatex")
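For what it's worth, here is a rough sketch of how template matching with a confidence label might work, just to make the idea concrete. Everything in it is an assumption: autometa.py does not exist, the names TEMPLATES, Guess, extract_author and format_guess are made up, and the "amslatex" pattern is invented for illustration and not derived from the real AMS-LaTeX layout. It also assumes the PDF has already been converted to plain text by some other tool.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical templates: name -> regex with a named "author" group.
# The "amslatex" pattern is a placeholder, not the real AMS-LaTeX layout.
TEMPLATES = {
    "amslatex": re.compile(
        r"^by\s+(?P<author>[^\n]+?)\s*$", re.MULTILINE | re.IGNORECASE
    ),
}


@dataclass
class Guess:
    value: Optional[str]   # extracted field, or None if nothing was found
    confidence: str        # "High", "Low" or "None"
    reason: str            # short explanation shown in verbose output


def extract_author(text: str) -> Guess:
    """Guess the author from already-extracted plain text."""
    # A template hit is treated as high confidence.
    for name, pattern in TEMPLATES.items():
        match = pattern.search(text)
        if match:
            return Guess(match.group("author"), "High",
                         f'matches template "{name}"')
    # Crude fallback: assume the author is the second non-empty line.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) >= 2:
        return Guess(lines[1], "Low", "fallback: second non-empty line")
    return Guess(None, "None", "no text to inspect")


def format_guess(field: str, guess: Guess) -> str:
    """Render a guess in the style of the example output in this thread."""
    return (f'{field}: "{guess.value}" '
            f'Confidence: {guess.confidence} ({guess.reason})')


sample = "On Some Topic\nBy Edward Witten\nAbstract. ..."
print(format_guess("Author", extract_author(sample)))
```

The point of returning a reason alongside the confidence is that a caller can decide whether to trust a "Low" guess or send the document for manual review. A real tool would need many more templates and something better than a line-position fallback.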