tag2find forum
May 22, 2012, 04:02:00 pm *
Author Topic: Storing tags as NTFS metadata?  (Read 7015 times)
MrCrispy
User
« on: February 14, 2007, 10:53:50 pm »

I just found tag2find (through a post on the donationcoder.com forums) and think it's a wonderful idea! Like many others, I've been waiting patiently for file system tagging, and after WinFS was taken back out and cruelly murdered, there really wasn't much hope. So it's nice to see something being done for home users in this area.

I've signed up for the beta but haven't received the invite yet, so these are just preliminary thoughts based on what I've read. My understanding is that tags are stored in your database and you probably link tags to files using filename+path as the unique id. NTFS change tracking is used to update the tag db if files are moved/deleted/renamed.

Have you considered storing the tags in alternate data streams (ADS) inside the files themselves? NTFS fully supports extensible metadata, as I'm sure you know, and many programs, such as MP3 and picture taggers (using IPTC/XMP), already store the tags inside the objects themselves. I see a number of huge advantages in this approach over a separate db:

- files contain their own metadata, so no work needs to be done to track them. More importantly, the tag db will never be in an inconsistent state. There actually would be no need for a tag db except to serve as a repository for tag clouds, tracking frequency, related tags etc.

- you could get rid of the initial drive scanning. Tags could be applied as and when needed.

- since tagging is now in a standard format, it's possible for other programs to view/expose the metadata. I would like to think that this is the ultimate goal of tagging - not as some addon to existing file organization methods but a fundamentally different metaphor.

However this can be a potential disadvantage if you want to keep the tag functionality proprietary.

- it would be trivial (well, a lot easier) to write an Explorer namespace extension that allows viewing and editing tags. Much more powerful stuff can be done in this area - e.g. extensions that are virtual folders based on tags or tag combinations. This would be exactly like what Vista/Spotlight give us, except it wouldn't depend on an indexing service.

- utilities exist to import/export NTFS streams when moving files to a different filesystem, such as CDFS/FAT32. This would solve the problem of supporting external media as well as networked shares.

- Doing this on a lower level is fundamentally a better way because you are not adding another abstraction layer on top of the file system. Basically, I believe WinFS was intended to work in the same way. It also opens up future enhancements such as adding relationships in metadata, writing different kinds of metadata handlers (i.e. the tag could be a more complex object rather than just text, and could perform functions such as redirections).
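
As a rough illustration of the per-file approach described above, here is a minimal Python sketch. It assumes Windows with an NTFS volume, where a named stream is addressed as `<file>:<stream>`. The stream name `user.tags` and the JSON encoding are assumptions made for this example; they are not tag2find's actual format, which has not been published.

```python
import json

# Hypothetical stream name chosen for this sketch, not tag2find's.
STREAM_NAME = "user.tags"


def stream_path(file_path: str, stream: str = STREAM_NAME) -> str:
    """Return the NTFS syntax for a named stream: <file>:<stream>."""
    return f"{file_path}:{stream}"


def write_tags(file_path: str, tags: list[str]) -> None:
    """Store the de-duplicated, sorted tags as JSON in a named stream."""
    with open(stream_path(file_path), "w", encoding="utf-8") as f:
        json.dump(sorted(set(tags)), f)


def read_tags(file_path: str) -> list[str]:
    """Return the stored tags, or an empty list if no tag stream exists."""
    try:
        with open(stream_path(file_path), "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return []
```

On a file system without stream support the same calls do not attach anything to the file (they fail or create a separate file with a colon in its name), which is exactly the portability problem raised later in this thread.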

Of course, there are some rather obvious disadvantages as well -

- the size of each file would increase. Although the total size increase shouldn't be much more than a central db.

- performance may be slower than your approach

- technically, it's probably harder to implement. Again, I'm not sure of what you have now so I can't comment on this.

- It would be easy for others to reverse engineer and use. So if your ultimate goal is to have a commercial version for enterprise use, this might not be the best idea :)

I'll stop rambling now! I'm just very excited by this idea and have been toying with writing something similar myself (as an NTFS file system driver) but never really got anywhere.

Let me know what you guys think.
MrCrispy
« Reply #1 on: February 14, 2007, 11:12:57 pm »

btw, if I got the details wrong (i.e. you ARE storing the tags in files), let me know.
martin
Developer
« Reply #2 on: February 15, 2007, 09:57:17 am »

Hi,

Thank you very much for your extensive and detailed post!

Without being able to go into too much detail about our implementation (sorry ;-)), we are already using Alternate Data Streams for storing tags as a fallback, which is why we require NTFS at the moment. We are not publishing the format yet, as it is subject to change (and not yet very sophisticated), but we will probably publish that sort of information at a later point of development to allow others to hook into the application.

The initial file system scanning you are referring to is not a step necessary for tagging but for file tracking.

Sorry for not being able to give you more details at the moment, but certain parts of our implementation have to remain proprietary (at least for now), as we'll need to earn money for a living one day. Hope you can understand this.

Quote

- utilities exist to import/export NTFS streams when moving files to a different filesystem, such as CDFS/Fat32. This would solve the problem of supporting external media as well as networked shares.


Could you elaborate a little bit on this? I confess it's the first time I have heard about such utilities, and it would indeed be a great way of adding support for non-NTFS media.

Best regards,
Martin

tag2find development team
Armando
Contributor
« Reply #3 on: June 19, 2007, 05:36:10 am »

Regarding metadata, proprietary format, etc.

As you’ll probably agree, portability is usually an important aspect of computing, and it seems like a pretty legitimate expectation for a user to be able to keep some kind of control over his/her data, even if it was created with a piece of software that’s proprietary. In that spirit, file identification or “tags data” (which generally represents many, many hours of work) shouldn’t be completely lost in the event of file transfers (for example, transfers to: 1- somebody else’s computer, 2- another OS or file system, 3- just another USB drive (to be able to work on another computer), to name a few…), or the use of different software.

So, without necessarily giving “too much detail about your tagging implementation” I’d like to know if the tag2find team has thought of some temporary solutions to cope with these fairly frequent situations involving file transfers, or OS/software changes?

One possibility, if I may suggest one ;-), would be to implement a function that would allow tag2find to directly append tags on demand to selected files, but in a way that’s readable, easily decodable and searchable (for example, with other desktop search software like locate32, Windows Desktop Search, Copernic, etc. Of course, these desktop search tools wouldn’t be able to make or edit the tags, but they could at least read them.)

 Ways of doing that could be to :

1-   append the tags directly to the filenames (ex: “tag2find.doc”, with the tags “tagging” and “software”, would become “tag2find +tagging +software.doc” or something like that). That would work well in certain situations with minimal tools. It would also ensure that most tags wouldn’t be lost (unless one is going to be using an older file system with an 8, 31, 32, 64 or 128 character limit).

2-   append the tags directly to some metadata fields  (“keywords”, “comments”, “title”, etc. in the file summary, or IPTC, EXIF, JPEG Comments, ID3, or whatever.)

(or even 3-  export the list of tags and files to a file… That solution is, of course, the least practical.)
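
Option 1 can be sketched in a few lines of Python. This is only an illustration of the naming scheme from the example above (“tag2find +tagging +software.doc”); the function names are made up for this sketch, and it assumes that neither the original name nor any tag contains the sequence " +".

```python
import os


def add_tags_to_name(filename: str, tags: list[str]) -> str:
    """Insert ' +tag' markers between the base name and the extension."""
    stem, ext = os.path.splitext(filename)
    suffix = "".join(f" +{t.strip()}" for t in tags if t.strip())
    return f"{stem}{suffix}{ext}"


def split_tags_from_name(filename: str) -> tuple[str, list[str]]:
    """Recover the original name and the tag list from a decorated name."""
    stem, ext = os.path.splitext(filename)
    base, *tags = stem.split(" +")
    return base + ext, tags


decorated = add_tags_to_name("tag2find.doc", ["tagging", "software"])
print(decorated)                        # tag2find +tagging +software.doc
print(split_tags_from_name(decorated))  # ('tag2find.doc', ['tagging', 'software'])
```

Because the tags end up in the plain filename, any search tool that indexes filenames can find them, at the cost of longer names and the length limits mentioned above.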


Then, if one needs to use his/her files on somebody else’s computer (or on another OS, etc.) and can’t use tag2find for some reason, one can still have access to parts of the original categorization (even if only in a restricted way).

Without some kind of “safety net” (and even if I find tag2find already very very well done), it’s difficult for me to just “trust” the software, go ahead and tag away my 15000 documents (weeks of work…).

Best regards,

M.
martin
« Reply #4 on: June 19, 2007, 07:33:30 am »

Hi,

thanks for your very long posting. We take this issue very seriously. Portability is very important for us and we know we are a little bit behind our own schedule here at the moment.

We are definitely working in this direction, but at the moment we cannot directly support external media, especially if they are FAT32 (which most memory sticks are). We will continue to work in this area, however, and will store tags for removable media somewhere on the removable media themselves.

At the moment, to prevent you from losing your hard work, we provide one basic backup possibility: export to a plain-text XML file. The schema of the XML is very simple and will certainly protect you from vendor lock-in, which we understand nobody really wants. The backup has some downsides at the moment, as the files are stored with absolute paths, but it will always allow recovery in case of a disaster, perhaps requiring a little tweaking with a text editor if the location of files has changed.
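
The "tweaking" of absolute paths could also be scripted instead of done by hand. The Python sketch below assumes a made-up export layout (`<tagexport>`, `<file path="...">`, `<tag>`); the real tag2find schema is not shown in this thread, so the element and attribute names here are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout; the real tag2find schema may differ.
SAMPLE = """<tagexport>
  <file path="C:\\Docs\\report.doc"><tag>tagging</tag><tag>software</tag></file>
  <file path="C:\\Docs\\notes.txt"><tag>ideas</tag></file>
</tagexport>"""


def relocate(xml_text: str, old_prefix: str, new_prefix: str) -> str:
    """Rewrite absolute paths that start with old_prefix to new_prefix."""
    root = ET.fromstring(xml_text)
    for node in root.iter("file"):
        path = node.get("path", "")
        if path.startswith(old_prefix):
            node.set("path", new_prefix + path[len(old_prefix):])
    return ET.tostring(root, encoding="unicode")


def load_tags(xml_text: str) -> dict[str, list[str]]:
    """Map each file path in the export to its list of tags."""
    root = ET.fromstring(xml_text)
    return {
        node.get("path"): [t.text for t in node.iter("tag")]
        for node in root.iter("file")
    }


moved = relocate(SAMPLE, "C:\\Docs\\", "D:\\Archive\\")
print(load_tags(moved))
```

Storing paths relative to a root folder in the export would avoid this step altogether, but that would be a change to the export format rather than to the recovery procedure.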

We do not really make a big secret of how our tags are stored: at the moment they are stored in two locations, a system-wide tag database (SQLite) and an Alternate Data Stream attached to the file itself (which is the reason why we can only support NTFS at the moment). Tags can be recovered from the backup or from the NTFS Alternate Data Stream in case the central database becomes corrupted (which is highly unlikely). Alternate Data Streams are copied together with a file by Windows Explorer, as long as the target volume supports them.
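
To make the two-location idea concrete, here is a minimal Python sketch of rebuilding a central SQLite table from tags recovered per file. The table layout and function names are assumptions for illustration only, not the actual tag2find schema, and `per_file_tags` stands in for whatever is read back from each file's data stream.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a minimal file/tag table (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS file_tags ("
        "path TEXT NOT NULL, tag TEXT NOT NULL, PRIMARY KEY (path, tag))"
    )
    return db


def rebuild(db: sqlite3.Connection, per_file_tags: dict[str, list[str]]) -> None:
    """Repopulate the central table from tags recovered from each file."""
    with db:  # one transaction: commit on success, roll back on error
        db.execute("DELETE FROM file_tags")
        db.executemany(
            "INSERT OR IGNORE INTO file_tags (path, tag) VALUES (?, ?)",
            [(p, t) for p, tags in per_file_tags.items() for t in tags],
        )


def files_with_tag(db: sqlite3.Connection, tag: str) -> list[str]:
    """Return the paths carrying the given tag, sorted by path."""
    rows = db.execute(
        "SELECT path FROM file_tags WHERE tag = ? ORDER BY path", (tag,)
    )
    return [r[0] for r in rows]
```

The central table makes queries such as "all files tagged X" cheap, while the per-file copy acts as the source the table can be regenerated from.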

I hope this answer is what you expected. I can promise you we will improve/add support for removable media as soon as our development schedule permits. (I personally am really longing for this support, as I personally need it as well.)

Best regards,
Martin
Armando
« Reply #5 on: June 22, 2007, 08:32:47 pm »

Thanks for the follow up.
Yes, it answers my questions!
I saw other posts on portability, so I guess things are moving this way.
All the best,
M.
Anonymous
Guest
« Reply #6 on: June 27, 2007, 04:22:06 pm »

I think that MrCrispy is bang on the money with his comment. The world is really crying out for an efficient tagging mechanism - and I believe yours is the best UI I have seen.

However, there are many search tools and each person has their preferences. Also, the search vendors have the resources and route to market that you will always lack.

I know that my user base of scientists and information workers would jump at the chance of using your tagging system. In fact, the VPs of Marketing from Thermo Informatics and LabVantage, both big players in my field, jumped ship to form a company, Ardenno, to develop a similar solution.

Adding your tags to the NTFS metadata (even the underused Keyword property) would allow the search guys to do search, the data management guys to do data management - leaving Tag2Find to do what it does best - Tagging!

If you concentrate on your core competence you have a very strong business value proposition.

I wish you luck - please drop me a line if this makes sense and you would consider co-marketing in the scientific information management arena.

All the best,

David

David Joyce
Product Manager
Thermo Fisher Scientific - Informatics.
Anonymous
Guest
« Reply #7 on: June 27, 2007, 04:24:36 pm »

David

David Joyce
Product Manager
Thermo Fisher Scientific - Informatics.
My contact email..
David.Joyce at thermofisher dot com
martin
« Reply #8 on: June 28, 2007, 08:33:49 am »

Hi David,

thanks for your encouraging words. I am sure you will be contacted soon ;-)

Best regards,
Martin