Article 3457 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:3457
Newsgroups: comp.lang.perl
Path: feenix.metronet.com!news.utdallas.edu!tamsun.tamu.edu!cs.utexas.edu!math.ohio-state.edu!cyber1.cyberstore.ca!van-bc!vanbc.wimsey.com!cs.ubc.ca!uw-beaver!fluke!inc
From: inc@tc.fluke.COM (Gary Benson)
Subject: Re: Wordperfect to ASCII conversion with perl?
Message-ID: <1993Jun16.145908.4191@tc.fluke.COM>
Keywords: wordperfect
Organization: John Fluke Mfg. Co., Inc., Everett, WA
References: <C8KErG.5IC@murdoch.acc.Virginia.EDU>
Date: Wed, 16 Jun 1993 14:59:08 GMT
Lines: 57

In article <C8KErG.5IC@murdoch.acc.Virginia.EDU> jpw@sansfoy.lib.Virginia.EDU (John Price-Wilkin) writes:

>We have more than 1500 collection-specific guides to portions of our  
>Library's Rare Books and Special Collections in WordPerfect and would
>like to make ASCII versions available on our gopher.  Would it possible to  
>use perl to remove the WP header information and reformat paragraphs to  
>groups of 70 char lines?

Absolutely this would be possible; and not just that, but highly desirable!
Perl is ideally suited to this kind of business, and if others haven't
already urged you to do so, let me be the first.

I use perl all the time to whip up little tools for translating from SAF
(Some Arbitrary Format) to real true ASCII text. 

>Aside from things like CTRL-Z, form feeds, and diacritics, are there other
>coding concerns?  Any help on this would be appreciated.  Right now we're
>reduced to scrapping together spare moments for the conversion, and the
>first few hundred have been tedious.

I can imagine. I'd like to help with more than just encouragement, but I'll
have to leave this one to a WordPerfect aficionado. All I know about the
product is that it uses 8-bit extended ASCII, at least for the special German
characters with umlauts and the eszett. Here is a tiny piece of perl that
will convert these to common English transliterations:

# German substitutions - 8-bit WordPerfect ASCII to common sequences

while (<>) {
    s/\201/ue/g;                  # u-umlaut
    s/\204/ae/g;                  # a-umlaut
    s/\224/oe/g;                  # o-umlaut
    s/\232/Ue/g;                  # U-umlaut
    s/\341/ss/g;                  # eszett
    print;
}
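
The substitution loop above can also be written table-driven, which makes it
easier to add characters later. What follows is only a sketch, assuming Perl 5
and assuming the files use the IBM PC code page 437 byte values (where
o-umlaut is octal 224). The A-umlaut and O-umlaut entries and the name
transliterate() are additions for illustration, not part of the original
posting.

```perl
use strict;
use warnings;

# Assumed mapping: code page 437 bytes (octal) to ASCII transliterations.
my %translit = (
    "\201" => 'ue',   # u-umlaut
    "\204" => 'ae',   # a-umlaut
    "\224" => 'oe',   # o-umlaut
    "\232" => 'Ue',   # U-umlaut
    "\216" => 'Ae',   # A-umlaut (assumed, not in the original list)
    "\231" => 'Oe',   # O-umlaut (assumed, not in the original list)
    "\341" => 'ss',   # eszett
);

# Build a single character class from the keys of the table.
my $class = join '', map { sprintf '\\x%02x', ord } keys %translit;
my $re    = qr/([$class])/;

# Return a copy of $text with every mapped byte replaced.
sub transliterate {
    my ($text) = @_;
    $text =~ s/$re/$translit{$1}/g;
    return $text;
}

# Filter any files named on the command line.
if (@ARGV) {
    print transliterate($_) while <>;
}
```

Bytes that are not in the table pass through untouched, so it is easy to see
afterwards which characters still need an entry.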


>John Price-Wilkin
>jpw@virginia.edu
>jpw@sansfoy.lib.virginia.edu (NeXTMail)

The hardest part is simply defining the parameters -- once you do that, perl
will scream through those files and dump out ASCII faster than scat!
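
One of those parameters is the 70-character line length asked for in the
original question. A minimal sketch of the reflow step, assuming Perl 5 and
assuming paragraphs are already separated by blank lines once the
WordPerfect-specific bytes are gone; the name reflow() is hypothetical:

```perl
use strict;
use warnings;

# Reflow one paragraph into lines of at most $width characters.
# A single word longer than $width is left on a line of its own.
sub reflow {
    my ($para, $width) = @_;
    $width = 70 unless defined $width;
    my @lines;
    my $line = '';
    for my $word (split ' ', $para) {
        if ($line eq '') {
            $line = $word;
        } elsif (length($line) + 1 + length($word) <= $width) {
            $line .= " $word";
        } else {
            push @lines, $line;
            $line = $word;
        }
    }
    push @lines, $line if $line ne '';
    return join("\n", @lines) . "\n";
}

# Paragraph mode: read files named on the command line one
# blank-line-separated paragraph at a time.
if (@ARGV) {
    local $/ = '';
    my @out;
    while (my $para = <>) {
        push @out, reflow($para);
    }
    print join("\n", @out);
}
```

Setting $/ to the empty string makes each read return a whole paragraph, so
the original line breaks inside a paragraph are simply discarded and rebuilt.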

ps: I seem to recall someone saying that WordPerfect uses control codes to
indicate bolding, centering, underlining and so on. If that is indeed the
case, you have to decide how your perl program will handle these things,
since only centering can be properly represented in ASCII... Please post
your results! Many others need this, I am sure!
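
The simplest policy for such codes is to drop them. The sketch below (Perl 5
assumed, strip_controls() a hypothetical name) deletes the plain ASCII control
characters, including the CTRL-Z and form feeds mentioned in the question. It
does not interpret WordPerfect's own formatting codes, whose layout is not
described in this thread, so treat it as a cleanup pass only.

```perl
use strict;
use warnings;

# Delete ASCII control characters except tab (\011) and newline (\012).
# This removes CTRL-Z (\032) and form feed (\014); carriage returns (\015)
# from DOS line endings are dropped as well.
sub strip_controls {
    my ($text) = @_;
    $text =~ tr/\000-\010\013-\037\177//d;
    return $text;
}

# Filter any files named on the command line.
if (@ARGV) {
    print strip_controls($_) while <>;
}
```

If form feeds should survive as page breaks, take \014 out of the deleted
range by splitting \013-\037 into \013 and \015-\037.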



-- 
Gary Benson-_-_-_-_-_-_-_-_-_-inc@sisu.fluke.com_-_-_-_-_-_-_-_-_-_-_-_-_-_-

Freedom is just chaos with better lighting.    -Alan Dean Foster 


Article 3592 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:3592
Path: feenix.metronet.com!news.ecn.bgu.edu!wupost!cs.utexas.edu!uunet!psgrain!ee.und.ac.za!tplinfm
From: barrett@lucy.ee.und.ac.za (Alan Barrett)
Newsgroups: comp.lang.perl
Subject: Re: Wordperfect to ASCII conversion with perl?
Date: 21 Jun 1993 10:35:08 +0200
Organization: Elec. Eng., Univ. Natal, Durban, S. Africa
Lines: 21
Message-ID: <203rrs$8n8@lucy.ee.und.ac.za>
References: <C8KErG.5IC@murdoch.acc.Virginia.EDU>
NNTP-Posting-Host: lucy.ee.und.ac.za

In article <C8KErG.5IC@murdoch.acc.Virginia.EDU> jpw@sansfoy.lib.Virginia.EDU  
(John Price-Wilkin) writes:
> We have more than 1500 collection-specific guides to portions of our  
> Library's Rare Books and Special Collections in WordPerfect and would
> like to make ASCII versions available on our gopher.  Would it possible to  
> use perl to remove the WP header information and reformat paragraphs to  
> groups of 70 char lines?  Aside from things like CTRL-Z, form feeds, and  
> diacritics, are there other coding concerns?  

Try wp2x (from comp.sources.misc volume 22).  It purports to be able to
convert from WordPerfect to several other formats, but I have an idea
that it was restricted to WordPerfect version 4.x.  There might be a
newer version somewhere.

Or you could try wp2latex (ask archie).  It does a passable job of
converting WordPerfect version 5.1 documents to LaTeX format.

If you don't get usable output from either of these, you should at
least get some ideas on how to decode the WordPerfect format.

--apb (Alan Barrett)