Parsing man pages

Member
Posts: 70
Joined: 2004.06
Post: #1
Firstly, I must apologize for posting this here rather than on iDA. Unfortunately I can't seem to register there, so for now I'll just have to plague iDG with non-game-related posts. Very sorry.

Now for the problem: I want to be able to read in a man page and display it in an NSTextView with nice and pretty formatting, but it's proving to be somewhat more difficult than I initially expected. As you'd expect.

It would appear that man pages are formatted using one of the many *roff variations, which in itself is not too big a deal (although I must say, *roff formatting is pretty disgusting to look at). I could have simply written something that parsed the various tags and left it at that.
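For a rough idea of what parsing the source form involves, here is a minimal Python sketch that classifies lines of man-macro source and splits inline font escapes. It assumes the page uses the classic man(7) macros (.TH, .SH, .B and so on) rather than mdoc, and it only knows a handful of them. The names MACRO_ROLES, classify_line and split_fonts are made up for illustration, and this is nowhere near a full *roff interpreter.

```python
import re

# A few man(7) macros mapped to the role they play in the page.
# This table is illustrative and deliberately incomplete.
MACRO_ROLES = {
    "TH": "title",
    "SH": "heading",
    "SS": "subheading",
    "PP": "paragraph_break",
    "TP": "tagged_paragraph",
    "B": "bold",
    "I": "italic",
}

# Inline font escapes: \fB bold, \fI italic, \fR roman, \fP previous font.
FONT_ESCAPE = re.compile(r"\\f([BIRP])")


def classify_line(line):
    """Return (role, text) for one line of man-macro source."""
    line = line.rstrip("\n")
    if line.startswith('.\\"'):          # .\" starts a comment line
        return ("comment", line[3:].strip())
    if line.startswith("."):             # any other leading dot is a macro call
        name, _, rest = line[1:].partition(" ")
        return (MACRO_ROLES.get(name, "unknown_macro"), rest.strip())
    return ("text", line)


def split_fonts(text):
    """Split running text on font escapes into (font, chunk) pairs."""
    runs = []
    font = "R"        # current font, roman by default
    previous = "R"    # font to return to on \fP
    pos = 0
    for match in FONT_ESCAPE.finditer(text):
        if match.start() > pos:
            runs.append((font, text[pos:match.start()]))
        new = match.group(1)
        if new == "P":
            font, previous = previous, font
        else:
            previous, font = font, new
        pos = match.end()
    if pos < len(text):
        runs.append((font, text[pos:]))
    return runs
```

Each (role, text) pair could then be mapped to whatever attributes the text view should use for headings, bold and italic runs. Quoted arguments such as .SH "SEE ALSO" are returned with their quotes intact, so a real parser would need to handle those too.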

However, depending on whether the page is in a cat or a man directory, it uses a different style of formatting. This is not good, as it would mean I have to write two interpreters (I was not impressed with the prospect of having to write one).
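The pages under the cat directories are the already-formatted nroff output, where bold is traditionally encoded by overstriking a character with itself (X, backspace, X) and underline by an underscore, a backspace and then the character. A second interpreter for that form can be quite small. The following is a minimal Python sketch under that assumption; parse_overstrike is a hypothetical name and the style labels are arbitrary.

```python
def parse_overstrike(text):
    """Collapse backspace overstrikes into (style, text) runs.

    "X\\bX" is treated as bold X and "_\\bX" as underlined X.
    A character that is both underlined and bold ends up as "bold",
    and "_\\b_" is ambiguous, so it is treated as a bold underscore.
    """
    runs = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        i += 1
        style = "plain"
        # Consume every "backspace + character" pair that follows.
        while i + 1 < n and text[i] == "\b":
            over = text[i + 1]
            if ch == "_" and over != "_":
                style = "underline"
            elif ch == over:
                style = "bold"
            ch = over          # the last glyph struck is the visible one
            i += 2
        # Merge with the previous run when the style has not changed.
        if runs and runs[-1][0] == style:
            runs[-1] = (style, runs[-1][1] + ch)
        else:
            runs.append((style, ch))
    return runs
```

The resulting runs carry the same kind of information as the font escapes in the source form, so both kinds of page could feed one formatting step for the text view.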

The lazy-Sam solution is just to run system("man [page of interest]"); and chuck the result of that into my text view, but that doesn't allow me to do all the fancy formatting I would like to, and quite possibly won't even preserve the standard formatting done by man.

Bugger.

So, does anyone know how I should go about formatting the pages? I am sure that I will have to write my own interpreter if I want to do my own formatting (for the NSTextView), as any pre-written programs certainly won't do it for me. However, I'm having difficulty finding information on the exact format used for man pages, as all Google searches just return pages and pages of man pages on man.

Of course, it's entirely possible that, since I know nothing whatsoever about *roff formats, I'm completely missing the whole thing and writing an interpreter is a fairly trivial (if time-consuming) process. I'm hoping someone somewhat more in-the-know than myself can enlighten me here.

So with no further ado, let the enlightenment begin!
Member
Posts: 196
Joined: 2003.10
Post: #2
Why not throw the contents of system("man $appname") into a string variable or temporary text file, and then format the result? Or do you want more control?
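One catch with this idea: system() only returns the command's exit status, so getting the text into a string needs a pipe (popen() in C, or the equivalent elsewhere). A minimal Python sketch of the capture step is below. capture_man_page and its man_cmd parameter are hypothetical names; man_cmd exists only so a different command can be substituted.

```python
import subprocess


def capture_man_page(name, man_cmd=("man",)):
    """Run `man <name>` and return what it writes to stdout as text."""
    result = subprocess.run(
        [*man_cmd, name],
        stdout=subprocess.PIPE,   # collect the formatted page
        stderr=subprocess.PIPE,   # keep "No manual entry" noise off the terminal
        check=True,               # raise CalledProcessError if man fails
    )
    # Decode leniently: old pages are not guaranteed to be valid UTF-8.
    return result.stdout.decode("utf-8", errors="replace")
```

Depending on the system, the captured text may still contain backspace overstrikes for bold and underline, so it would need a clean-up or parsing pass before display.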

The source code to "man" must be available. If not, I'm sure someone's done this before. Perhaps you can find a perl or python parser that's already been written?

I'm guessing this is for a dashboard widget?
Luminary
Posts: 5,143
Joined: 2002.04
Post: #3
There is a groff command-line tool, which can output PostScript, which can be passed to pstopdf, which can be displayed by CoreGraphics or PDFKit.
Member
Posts: 70
Joined: 2004.06
Post: #4
Keith, are there any particular options I need to specify for groff to output proper PostScript data? I tried just running groff on a normal man page and piping that into pstopdf, however the result was weird weird weird. The text went off the top of the page, it looked as if a lot of words had simply been dropped from the final page, and everything was just one big paragraph. Not quite as pretty as I'd hoped for :-p

I had a look at the man page for groff, but I didn't see anything that looked like it might be related to PostScript output. The pstopdf man page had absolutely nothing of interest, either.

Am I missing something?
Luminary
Posts: 5,143
Joined: 2002.04
Post: #5
Code:
cat /usr/share/man/man3/glTexImage2D.3 | groff -Tps -mandoc -c | pstopdf -i -o ~/Desktop/glTexImage2D.pdf

Extracted from the man pages for groff, man and pstopdf :-p
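If the conversion has to be driven from a program rather than typed into a shell, the same pipeline can be chained with pipes. This is a minimal Python sketch that reuses exactly the flags from the command above. The function names are hypothetical, and it assumes groff and pstopdf are on the PATH.

```python
import subprocess


def pipeline_commands(pdf_path):
    """Return the two argument lists used by the shell pipeline above."""
    groff_cmd = ["groff", "-Tps", "-mandoc", "-c"]
    pstopdf_cmd = ["pstopdf", "-i", "-o", str(pdf_path)]
    return groff_cmd, pstopdf_cmd


def man_source_to_pdf(man_path, pdf_path):
    """Feed a man source file through groff and pstopdf to produce a PDF."""
    groff_cmd, pstopdf_cmd = pipeline_commands(pdf_path)
    with open(man_path, "rb") as source:
        # groff reads the page on stdin, replacing the `cat file |` step.
        groff = subprocess.Popen(groff_cmd, stdin=source, stdout=subprocess.PIPE)
        pstopdf = subprocess.Popen(pstopdf_cmd, stdin=groff.stdout)
        # Close our copy of the pipe so groff sees SIGPIPE if pstopdf exits early.
        groff.stdout.close()
        pstopdf_status = pstopdf.wait()
        groff_status = groff.wait()
    if groff_status != 0 or pstopdf_status != 0:
        raise RuntimeError(
            f"pipeline failed: groff={groff_status}, pstopdf={pstopdf_status}"
        )
```

Note that no shell is involved here, so a path such as ~/Desktop/glTexImage2D.pdf would need os.path.expanduser() applied before being passed in.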
Luminary
Posts: 5,143
Joined: 2002.04
Post: #6
BTW, Xcode already does a decent job of this... it's in the help menu.
Sage
Posts: 1,199
Joined: 2004.10
Post: #7
Also, the app ManOpen has been doing a good job of this since, I gather, the NeXTSTEP days. Except it's broken for me on Tiger...

Also, Sogudi does a good job via Safari -- just type man:whatever