Ignore Punctuation

Post your bug reports here.
JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Ignore Punctuation

Post by JPG »

Is it possible to ignore punctuation?
I have tried using the regex and replace options, but I am not sure they are working properly.
What would a regex look like to ignore punctuation such as the full stop and comma? I would expect [,.] to work, and it does in the example below, which has just one line in each file.
However, if I use only [,] or only [.], the program marks all four punctuation marks as differences.
Can you please explain why this happens?

example
File 1
now, is the, time. for. all good men
File 2
now is the time for all good men



Jon
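
Editor's note: the question above can be reproduced outside the program with a small Python sketch. It assumes that "ignore" means "delete every regex match before comparing the lines"; the helper name strip_ignored is made up for illustration and is not part of the tool discussed here.

```python
import re

def strip_ignored(line: str, pattern: str) -> str:
    """Delete every match of pattern, like a 'replace with nothing' rule."""
    return re.sub(pattern, "", line)

# The two one-line files from the post above.
file1 = "now, is the, time. for. all good men"
file2 = "now is the time for all good men"

for pattern in (r"[,.]", r"[,]", r"[.]"):
    same = strip_ignored(file1, pattern) == strip_ignored(file2, pattern)
    print(pattern, "-> lines equal" if same else "-> lines still differ")
```

Only the pattern that covers both the comma and the full stop makes the two lines equal; with either single-character pattern, some punctuation remains and the lines still differ.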

grigsoft
Site Admin
Posts: 1673
Joined: Tue Sep 23, 2003 7:37 pm

Re: Ignore Punctuation

Post by grigsoft »

Well, this is the same problem we had with ignore case before. If all differing characters are ignored, the lines are considered the same and so are not highlighted at all. When only some of the differences are ignored, the lines are still different, and so all changes are highlighted. This is by design, and it will require some more work to handle.
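
Editor's note: the behaviour described in this reply can be sketched in Python as a two-stage comparison. This is only an illustration under stated assumptions, not the program's actual code: the function name highlighted_chars is hypothetical, and difflib stands in for whatever diff algorithm the program uses. Stage one decides whether the lines are equal using the text with ignored characters removed; stage two, reached only when they differ, highlights against the raw text.

```python
import re
from difflib import SequenceMatcher

def highlighted_chars(left: str, right: str, ignore_pattern: str) -> list:
    """Return the characters of `left` that would be marked as differences."""
    norm_left = re.sub(ignore_pattern, "", left)
    norm_right = re.sub(ignore_pattern, "", right)
    if norm_left == norm_right:
        return []  # lines count as equal, so nothing is highlighted
    # Lines still differ: diff the raw text, so ignored characters get marked too.
    matcher = SequenceMatcher(None, left, right, autojunk=False)
    marked = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            marked.extend(left[i1:i2])
    return marked

file1 = "now, is the, time. for. all good men"
file2 = "now is the time for all good men"

print(highlighted_chars(file1, file2, r"[,.]"))  # nothing marked
print(highlighted_chars(file1, file2, r"[,]"))   # commas and full stops marked
```

Under these assumptions, ignoring only commas still leaves the lines different, so the raw diff marks the commas as well as the full stops, which matches what Jon observed.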

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

grigsoft wrote:Well, this is the same problem we had with ignore case before. If all differing characters are ignored, the lines are considered the same and so are not highlighted at all. When only some of the differences are ignored, the lines are still different, and so all changes are highlighted. This is by design, and it will require some more work to handle.
Sorry, I don't see this regex function working the way it did in a previous build. Please try the simple example from my post above, with one line in each file, and try to ignore characters with a regex. Build 2173 works as I would expect. Please check that the function still works.

Jon

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

Hi Igor,
Have you made any progress on this issue?

Jon

grigsoft
Site Admin
Posts: 1673
Joined: Tue Sep 23, 2003 7:37 pm

Re: Ignore Punctuation

Post by grigsoft »

I do have RegExp working in the latest build without problems. You can ignore punctuation with [,.], for example.

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

grigsoft wrote:I do have RegExp working in the latest build without problems. You can ignore punctuation with [,.], for example.
OK, I seem to have that working now.
Thank you.

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

JPG wrote:
grigsoft wrote:I do have RegExp working in the latest build without problems. You can ignore punctuation with [,.], for example.
OK, I seem to have that working now.
Thank you.
Hi Igor,
I have an issue I don't quite understand.
I am now trying to ignore these characters with the regex [΄·᾿’•], replacing them with nothing.
This is a line compare, and in the Basic rules the only options I have checked are Ignore case and Ignore Accented.

Why are lines that contain no visible changes shown as "changed", with no markup of any differences? Do you see what I mean?
This is something you had fixed for the ignore case and ignore accents options, and it still works OK as long as I don't use the regex.

This is a very small example, but I think it shows an issue.
Attachment: Compare Test.zip (test file, 1.59 KiB)
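
Editor's note: for comparison, here is a minimal Python sketch of what the combined rules are expected to do in a Unicode-aware regex engine. It assumes the three rules amount to deleting the listed characters, removing accents, and folding case before the lines are compared. The normalize helper and the sample lines are made up for illustration; they are not taken from the attached test file or from the program.

```python
import re
import unicodedata

# The characters from the post, copied as they appear there.
IGNORED_CHARS = "΄·᾿’•"
IGNORED_RE = re.compile("[" + re.escape(IGNORED_CHARS) + "]")

def normalize(line: str) -> str:
    """Hypothetical normalization: drop ignored chars, accents, and case."""
    line = IGNORED_RE.sub("", line)
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", line)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.casefold()

# Made-up sample lines: one with an accent, a capital, and an ignored character.
left = "Λόγος" + IGNORED_CHARS[1]
right = "λογος"

print(normalize(left) == normalize(right))
```

If a line pair is equal after this kind of normalization, one would expect it to be reported as unchanged rather than as "changed" with no markup.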

grigsoft
Site Admin
Posts: 1673
Joined: Tue Sep 23, 2003 7:37 pm

Re: Ignore Punctuation

Post by grigsoft »

I have tweaked it a bit more; here is an improved version. However, it turns out that the RegExp engine I'm using cannot correctly handle some of the Unicode characters in your RegExp. This requires more work.
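
Editor's note: the thread does not say which engine is involved or why it mishandles these characters. One common failure mode, shown here purely as an illustration and not as a diagnosis of this program, is an engine that matches bytes rather than characters. In UTF-8, a character class written with multi-byte characters then becomes a class of individual bytes, which can match pieces of unrelated characters. The sample line is made up.

```python
import re

# A middle dot (U+00B7) and a bullet (U+2022), written as escapes to be explicit.
CHARS = "\u00b7\u2022"
PATTERN = "[" + CHARS + "]"

# Greek "την" followed by a middle dot; eta (U+03B7) is the bytes CE B7 in UTF-8.
line = "\u03c4\u03b7\u03bd\u00b7"

# Unicode-aware matching removes only the middle dot.
unicode_result = re.sub(PATTERN, "", line)

# Byte-oriented matching: the class now holds the bytes C2 B7 E2 80 A2,
# so the B7 byte inside eta is removed as well.
byte_result = re.sub(PATTERN.encode("utf-8"), b"", line.encode("utf-8"))

print(repr(unicode_result))
print(byte_result)
print(byte_result.decode("utf-8", errors="replace"))
```

In the byte-oriented case the eta loses its second byte, so the result is no longer valid UTF-8 and the line is corrupted instead of cleaned.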

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

grigsoft wrote:I have tweaked it a bit more; here is an improved version. However, it turns out that the RegExp engine I'm using cannot correctly handle some of the Unicode characters in your RegExp. This requires more work.
Hi Igor, that is again an improvement, thank you.

1. So what is the "flavor" of the regex engine?

2. Is there any hope that you can get it upgraded?

Jon

JPG
Posts: 109
Joined: Tue May 23, 2006 7:06 am

Re: Ignore Punctuation

Post by JPG »

JPG wrote:
grigsoft wrote:I have tweaked it a bit more; here is an improved version. However, it turns out that the RegExp engine I'm using cannot correctly handle some of the Unicode characters in your RegExp. This requires more work.
Hi Igor, that is again an improvement, thank you.

1. So what is the "flavor" of the regex engine?

2. Is there any hope that you can get it upgraded?

Jon

Any progress in fixing this?
