Re: shared-mime-info: distinguish x-octave and x-matlab files

octave-maintainers

From:	Carnë Draug
Subject:	Re: shared-mime-info: distinguish x-octave and x-matlab files
Date:	Fri, 13 Feb 2015 17:07:51 +0000

On 12 February 2015 at 19:29, Mike Miller <address@hidden> wrote:
> Hey, I think this is an interesting project and appreciate you taking it
> on with the freedesktop folks.
>
> On Wed, Jan 28, 2015 at 13:01:34 +0000, Carnë Draug wrote:
>> Hi
>>
>> currently, shared-mime-info [1] treats x-octave as an alias of x-matlab.
>> It does this despite using "##" as magic to identify the file as Matlab.
>> The most obvious problem of this is with text editors picking Matlab as
>> syntax highlighting for Octave code.  My take on the two mime types is that
>> having the file written in Octave does not matter; what matters is where it
>> can run.  This means that a file written in Octave that is Matlab compatible
>> deserves an x-matlab mime.
>
> Agree so far.
>
>> I submitted them a patch that splits them into two types and added magic
>> to recognize a shebang line for an octave interpreter [2].  In addition,
>> it also replaces the common "##" magic with "#" for Octave and "%" for
>> Matlab.  However, this last change was rejected because the magic seems to
>> be too short for their new standards.  Can anyone suggest new magic that
>> would help distinguish code that is meant to be Octave only from code
>> that is Matlab compatible?
>
> I honestly don't know if this is possible. As much as I'd also like to
> see an Octave MIME type stand on its own, I don't know how it can be
> automatically detected like this in a small way that would be acceptable
> to the freedesktop mime database (I assume they don't want to be parsing
> full files extensively just to derive the MIME type).

You are right, they do not.  At the moment, they will refuse patches that
search more than 256 bytes into the file (though I guess it may be ok for
an exact match at a fixed offset in the file).

> I tried thinking
> of some strings that could be used, but strings that may distinguish
> Octave files are really just conventions that we use. As we like to say,
> the Matlab language is a subset of the Octave language. So a Matlab .m
> file is also an Octave .m file. The only thing that makes an Octave .m
> file not a Matlab .m file are specific coding styles, syntaxes, and
> function calls used.
>
> Put another way, it's like trying to differentiate between a C++98 .cc
> file and a C++11 .cc file. There are any number of things you could try
> to match on to detect whether it's a C++11 file.
>
> There are any number of things you could search for in an Octave .m file
> ("## -*- texinfo -*-", "%!test", or "print_usage") but the lack of any
> of them does not mean it's not an Octave .m file. If I take a Matlab .m
> file and add a broadcasting operation, or double-quoted strings, that
> makes it an Octave .m file.

Yes, that is true.  But I think that if you want an Octave file, you
would probably have more than just that difference.  You would probably
have already used # for comments, and ! for NOT, because the only reason
not to do so is to keep the file Matlab compatible.

Unfortunately, I also can't think of other things.  I think that looking
for "#" and "%" magic at the top of the file would be the way to go.
It is possible to give that magic a low weight so that it is pretty much
only used when trying to distinguish between files that have the ".m"
file extension (the file extension has a lot more weight).
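
To make the idea concrete, here is a rough Python sketch of the heuristic.
It is not how shared-mime-info works internally, only an illustration: the
function name, the "text/" prefix on the type names, and the fallback to
x-matlab when no comment is found are all my assumptions.  The extension
decides first, and the comment character is only a tie-breaker within the
first 256 bytes.

```python
import os
from typing import Optional

MAX_MAGIC_BYTES = 256  # search limit mentioned in this thread


def guess_m_file_type(filename: str, data: bytes) -> Optional[str]:
    """Guess a MIME type for a ".m" file (hypothetical helper).

    The extension carries the most weight: anything that is not ".m" is
    not handled here.  The comment character is a low-weight tie-breaker.
    """
    if os.path.splitext(filename)[1] != ".m":
        return None

    # Only look at the start of the file, never at the whole content.
    head = data[:MAX_MAGIC_BYTES].decode("utf-8", errors="replace")

    # The first comment line found in that window decides the type.
    for line in head.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            return "text/x-octave"
        if stripped.startswith("%"):
            return "text/x-matlab"

    # Assumption: no comment found means "Matlab compatible".
    return "text/x-matlab"
```

A shebang line also starts with "#", so it would be picked up as Octave by
the same check.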

It would be nice to have some comments on the freedesktop bug tracker too.

>> Another thing I would suggest is to start adding a shebang line to our
>> Octave files.  This has two advantages, 1) make it easier for such
>> applications to recognize the file as Octave source, and 2) raise awareness
>> that it is possible to actually write an Octave program (I think this is a
>> really nice feature of Octave that is not given enough attention).
>
> I don't think this is a good idea. Some tools may interpret the presence
> of a shebang line as an executable script, whether the file is
> executable or not (thinking of systems like Windows where there is no
> execute bit), some Linux packaging tools may parse shebang lines and
> handle the files differently, etc.
>

Is this really a problem?  I thought Windows systems simply ignore shebang
lines.  And in languages such as Perl, the shebang line is pretty standard
even in files that are only modules.  The only thing I know some packaging
tools may do is edit shebang lines to point to the interpreter path that
was used during installation (which kinda makes sense).
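
As an illustration of that kind of rewrite, a packaging step could look
roughly like the Python sketch below.  This is not taken from any specific
packaging tool; the function name and the example path are hypothetical,
and for simplicity it drops any interpreter arguments from the old line.

```python
def rewrite_shebang(source: str, interpreter: str) -> str:
    """Point an existing shebang line at `interpreter` (hypothetical helper).

    Files without a shebang line are returned unchanged.  Simplification:
    arguments on the original shebang line are discarded.
    """
    if not source.startswith("#!"):
        return source
    # Split off the first line; `sep` is "" when the file has no newline.
    _first_line, sep, rest = source.partition("\n")
    return "#!" + interpreter + sep + rest


script = "#!/usr/bin/env octave\ndisp (1)\n"
print(rewrite_shebang(script, "/opt/octave/bin/octave"), end="")
```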

> As much as I'd like this to go somewhere, I think it is probably best
> handled as more extensive checks in each editor or IDE where possible,
> or as a user preference where it needs to be.
>

I think the idea is for text editors and IDEs to rely on the desktop to
identify the mime type.  At least that's how gedit does it.  While
shared-mime-info provides a default database that users can extend with
extra magic, it is not very clear how to do it.  This is why I would
like the default database to be better at guessing x-octave files.

Carnë
