[Tockit-general] Docco-0.3

Discussion:

Bobby

2004-04-13 13:26:02 UTC

Thank you for this neat program! I have been researching the best method
of implementing this kind of application, but it looks like I no longer
need to worry about that. My last idea was to adapt a web crawler to a
local application.

Docco does just about everything I need. I collect news and other
information for use in my own work as a writer and publisher of a
website. I need to access a large amount of this form of data, way
beyond the capability of any file browser, and yet that's all I could
find until now.

I downloaded the program and the two available plugins (pdfbox and poi).
I had the .bat extension associated with my text editor (Crimson), so
when I tried to run it, it just opened instead in the editor. Once I
fixed that, the program launched and ran as advertised.

However, I do have one small problem that you may be able to offer some
advice on. When I try to open a file from the tree list, I get a DOS
window which immediately closes, and the expected application never
opens at all.

The directory I indexed contains mostly .html files with some .pdf.
However, in the tree list, all files have the same Windows default icon,
which might be the best clue. For some reason, the file type isn't being
recognized.

I'm still running Windows 98SE which is likely the real problem. I've
been trying to migrate to Linux but haven't succeeded yet.

Do you have anything to offer on this?

Perhaps an idea for further development would be a small "thumbnail"-like
viewer that would show, for example, the first paragraph as you
select a document from the tree. Or, alternatively, a scrollable list of
such sampled text from the documents in the tree list. Locating a
particular document in this way would be much faster than loading each
document one by one into an application.

Regards,
Bobby Garner

Peter Becker

2004-04-13 17:17:09 UTC

Hi Bobby,

Post by Bobby
Thank you for this neat program! I have been researching the best
method of implementing this kind of application, but it looks like I
no longer need to worry about that. My last idea was to adapt a web
crawler to a local application.
Docco does just about everything I need. I collect news and other
information for use in my own work as a writer and publisher of a
website. I need to access a large amount of this form of data, way
beyond the capability of any file browser, and yet that's all I could
find until now.

Thanks for the praise :-)

Post by Bobby
I downloaded the program and the two available plugins (pdfbox and
poi). I had the .bat extension associated with my text editor
(Crimson), so when I tried to run it, it just opened instead in the
editor. Once I fixed that, the program launched and ran as advertised.
However, I do have one small problem that you may be able to offer some
advice on. When I try to open a file from the tree list, I get a DOS
window which immediately closes, and the expected application never
opens at all.
The directory I indexed contains mostly .html files with some .pdf.
However, in the tree list, all files have the same Windows default icon,
which might be the best clue. For some reason, the file type isn't being
recognized.
I'm still running Windows 98SE which is likely the real problem. I've
been trying to migrate to Linux but haven't succeeded yet.
Do you have anything to offer on this?

I just browsed for some information on this. We use a little Java class
called Browserlauncher, which does not seem to be maintained anymore.
But some people reported the problem you describe on Win9x/ME in the
bug tracker (it is a project on Sourceforge). I tried to apply that
patch, but since I left DOS-based Windows long ago, I can't test it.

Can you download this:
http://kvo.itee.uq.edu.au/~pbecker/ToscanaJ.jar

and replace the ToscanaJ.jar in Docco's libs with that? Does that help?

Post by Bobby
Perhaps an idea for further development would be a small "thumbnail"-like
viewer that would show, for example, the first paragraph as you
select a document from the tree. Or, alternatively, a scrollable list
of such sampled text from the documents in the tree list. Locating a
particular document in this way would be much faster than loading each
document one by one into an application.

We can think about that. It implies either storing this information in
the index (which normally does store which words are contained in a
document, but not their order) or re-reading the document again. But
that wouldn't be too hard and I agree that this could be a handy little
feature.
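
To illustrate the "re-reading the document" option, here is a minimal Java
sketch, not actual Docco code. It assumes the document text has already been
extracted to a plain string (for .html or .pdf that extraction step is not
shown), and the names PreviewSketch, firstParagraph and maxChars are
hypothetical.

```java
/** Hypothetical helper for a preview pane; not part of Docco or ToscanaJ. */
final class PreviewSketch {

    /**
     * Returns the first non-empty paragraph of the given text, with its
     * lines joined by single spaces and cut off after maxChars characters.
     */
    static String firstParagraph(String content, int maxChars) {
        StringBuilder text = new StringBuilder();
        for (String line : content.split("\r?\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                if (text.length() > 0) {
                    break;      // a blank line ends the first paragraph
                }
                continue;       // skip blank lines before the first paragraph
            }
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(trimmed);
            if (text.length() >= maxChars) {
                break;          // enough text collected for a preview
            }
        }
        return text.length() > maxChars
                ? text.substring(0, maxChars)
                : text.toString();
    }

    public static void main(String[] args) {
        String sample = "\nDocco indexes local files.\nIt runs on Java.\n\nSecond paragraph.";
        System.out.println(firstParagraph(sample, 200));
    }
}
```

Re-reading keeps the index small, at the cost of touching the file each time
a document is selected in the tree.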

Peter

Peter Becker

2004-04-13 22:45:04 UTC

"start" should be the program executed by the command line interpreter
(which I think is 16bit). The long file names might be a problem, but
don't have to be -- it all depends on the "start" command. At the moment
I can't do much but guess: the code we use seems to work for other
people, and since I don't have access to any DOS-based Windows I can't
test it myself. You'll probably have to live with that problem, unless I
get some inspiration or someone else has an idea or is willing to write
a little C/C++ tool with a JNI wrapper that does the trick. It would
just call some WinAPI function (I know there is one, but I am not sure
about the name; it could be ShellExecute), but creating such a thing
needs the proper setup, which I don't have at the moment. The trickiest
bit would probably be loading it on Windows and only on Windows.
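
As a rough illustration of the interpreter question, here is a minimal Java
sketch of choosing the command line by operating system. It is not the
BrowserLauncher code; the class OpenCommand and its methods are hypothetical.
It also assumes that os.name begins with "Windows 9" or "Windows Me" on
DOS-based Windows, and that "command.com /c start FILE" works there, which is
exactly the untested part.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper; not part of Docco, ToscanaJ or BrowserLauncher. */
final class OpenCommand {

    /** Assumption: DOS-based Windows reports os.name as "Windows 95/98/Me". */
    static boolean isDosBasedWindows(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        return os.startsWith("windows 9") || os.startsWith("windows me");
    }

    /** Builds the argument array for Runtime.exec, or null when not on Windows. */
    static String[] build(String osName, String path) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (!os.startsWith("windows")) {
            return null; // the caller needs a different strategy here
        }
        if (isDosBasedWindows(osName)) {
            // Assumption: command.com passes the file on to "start".
            return new String[] {"command.com", "/c", "start", path};
        }
        // On NT-based Windows, "start" treats its first quoted argument as
        // the window title, so an empty title is passed before the path.
        return new String[] {"cmd.exe", "/c", "start", "\"\"", path};
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "C:\\docs\\index.html";
        String[] command = build(System.getProperty("os.name"), path);
        System.out.println(command == null
                ? "not on Windows"
                : Arrays.toString(command));
        // To actually launch the file: Runtime.getRuntime().exec(command);
    }
}
```

The same os.name check could guard the loading of a native ShellExecute
wrapper, so that the library is only touched on Windows.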

Another thing I might do is support drag&drop as well as copy&paste.
The former would allow dragging the file URLs into other programs; the
latter could e.g. be used to copy the URL into Windows Explorer, which
might trigger the shell execute. The same goes for the Run dialog
(Win+R). No timeframe for this, though.
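
The copy&paste idea could look roughly like the following Java sketch, which
puts a file's URL on the system clipboard using the standard AWT classes. It
is only an assumption about how such a feature might be wired up; the class
CopyFileUrl and its methods are hypothetical and not taken from Docco.

```java
import java.awt.GraphicsEnvironment;
import java.awt.Toolkit;
import java.awt.datatransfer.StringSelection;
import java.io.File;

/** Hypothetical helper; not part of Docco or ToscanaJ. */
final class CopyFileUrl {

    /** Converts a file into a file: URL string, escaping spaces and the like. */
    static String fileUrl(File file) {
        return file.toURI().toString();
    }

    /** Places the given text on the system clipboard (needs a display). */
    static void copyToClipboard(String text) {
        StringSelection selection = new StringSelection(text);
        Toolkit.getDefaultToolkit()
               .getSystemClipboard()
               .setContents(selection, selection);
    }

    public static void main(String[] args) {
        String url = fileUrl(new File(args.length > 0 ? args[0] : "index.html"));
        System.out.println(url);
        if (!GraphicsEnvironment.isHeadless()) {
            copyToClipboard(url); // skipped when no display is available
        }
    }
}
```

Whether pasting such a URL into Windows Explorer or the Run dialog opens the
associated application on Win9x would still have to be tried out.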

Peter

None of those commands work. As you may know, Windows doesn't have a
terminal program. I think command.com is still a 16 bit DOS
interpreter. It doesn't recognize long filenames, and I'm pretty sure
there are no file associations. In other words, you can only use
'command' with an executable.
In Windows I do have the file association set correctly so that if I
click on a file in my file browser, it opens in the application. I
just verified that with html and pdf.
What time is it there?
Bobby
------------------------------------------------------------------------

Peter,
I clicked on the link and the file loaded in my browser bypassing my
download manager. I found your index page and the same thing
happened there. I wasn't quite sure what to do with it but I used
the "save as" option and overwrote the existing file. I was afraid
that might corrupt the file, but maybe not. I'm running Mozilla
Firebird.

"Save as..." should be ok. Mozilla tends to do that with certain
files, I think it relies too much on the server configuration and
ours probably doesn't know .jars are binary.

I don't see any difference when the program launches, and when I
click on a file, it's the same as before, a DOS window opens with no
content then closes.

Can you try to open a command line window and run some of these
commands on one of the files you want to open (FILE = path to file)?
command /c start "" FILE
command /c start FILE
start FILE
Which of these commands work for you?

I have the Java Development Kit (j2sdk1.4.1_02). I don't have much
experience, but I'm really interested in it, and I'd be happy to
tinker with it if you think there might be something I could do.
My only programming experience is with Quickbasic. I bought it just
before they announced Visual Basic. I've studied C and C++, and have
Borland's free command line tools, and Dev-CPP IDE. In addition, I
have eclipse sdk 2.1 which seems really top heavy to me. Also the
Fox Tool Kit.
I have collected all this software to try to figure out what I want
to put my time into, so I don't repeat the QB experience.
As you might surmise, I get along without programming, but as a
practicing electrical engineer, it sure is handy at times.

If you really want to go into Java development I'd recommend using a
full IDE -- you definitely want method expansion, the refactoring
facilities and other goodies. Eclipse is pretty good (I use 3.0M8 at
the moment, the milestones are usually stable), IDEA is better but
not for free. Some hints on how to get/use ToscanaJ code (which Docco
http://toscanaj.sourceforge.net/participate/index.html
If you want I can do a writeup of an Eclipse configuration. I had
that plan for a while.
If you just want a simple and easy to use language I'd recommend
going Python (http://python.org/). Java is powerful, but messy (esp.
some of the libraries are badly designed). Python is very neat and
since it is mainly a scripting language easy to get used to -- you
can always test things in the scripting environment.
HTH,
Peter

Bobby
------------------------------------------------------------------------
