Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SVG Groups are ignored. Especially transformations acting on the paths inside the group. #16

Open
alphanoob1337 opened this issue Mar 29, 2017 · 41 comments

Comments

@alphanoob1337
Copy link

Example in an SVG created in Inkscape:

<g
id="g3450-3"
transform="matrix(-1e-4,-1,1,-1e-4,670.77002,132.92999)">
<path
inkscape:connector-curvature="0"
d="m 28.35,0 c 0,15.65621 -12.69107,28.3485 -28.34728,28.35"
style="fill:none;stroke:#000000;stroke-width:0.72000003;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:10;stroke-opacity:1;stroke-dasharray:none"
id="path3452-8" />

The group transformation is ignored when calling svgpathtools.svg2paths().

@mathandy
Copy link
Owner

mathandy commented Mar 30, 2017

I'd like to add this functionality. I probably won't get to this in the near future though.

Here's how to do this (in case some helpful awesome person wants to make a pull-request):
You can quite easily take an affine transformation of lines, quadratics, and cubics by simply transforming the coefficients. E.g.,

p = parse_path("m 28.35,0 c 0,15.65621 -12.69107,28.3485 -28.34728,28.35")
A = np.array([[-1e-4, -1], [1, -1e-4]])
b = np.array([670.77002, 132.92999])

# these convert between real 2-vectors and complex numbers
xy = lambda z: np.array([z.real, z.imag])
z = lambda xy: xy[0] + 1j*xy[1]


transformed_path = Path(*[bpoints2bezier([z(np.dot(xy(pt),A) + b)
							for pt in seg.bpoints()]) 
								for seg in p])

This should work (I didn't test it, but just wanted to give the idea) for everything but paths containing Arc objects. Arc objects shouldn't be too hard to deal with either (see Arc.rotated() and Arc.translated()).
... it might be nice to just add a Path.transformed() method to deal with affine transformations in general.

@alphanoob1337
Copy link
Author

I took a look at the code by now. Yes. The transformation should be fairly easy to implement and I have started working on it. On Bézier curves, can I also just do a matrix transformation of the start, end and the two control nodes?

The issue is more that currently there is only a list of d strings created and then it is parsed. This way all information about the XML tree is lost. Because groups can be nested, the file has to be recursively parsed and transformed. If I am happy with my solution to the problem and it doesn't affect performance too much, I can do a pull request myself, if you want to have a look at it. It is still work in progress, though.

@mathandy
Copy link
Owner

mathandy commented Apr 1, 2017

On Bézier curves, can I also just do a matrix transformation of the start, end and the two control nodes?

That's exactly what the above code is doing.

The issue is more that currently there is only a list of d strings created and then it is parsed. This way all information about the XML tree is lost.

Keeping track of the group structure could be accomplished to some extent by simply adding an attribute to Path to keep track of any relevant information. I'm not sure what the desired behavior would be though when a Path is split into two, copied, tested for equality, etc.
Why retain the group structure? If all the path objects are transformed, then is it necessary to retain the groupings?

@alphanoob1337
Copy link
Author

OK. I included the code now with a few changes to keep it closer to this documentation https://developer.mozilla.org/de/docs/Web/SVG/Attribute/transform
Now I didn't want to retain the group structure. But to apply the transformations in the right order and to the right set of graphs you need to walk the tree instead of using minidom to get all the tags with a certain tag name.

My changes are ready for a pull request now. However I was not able to create a new branch. I get a Permission to mathandy/svgpathtools.git denied to alphanoob1337.

@mathandy
Copy link
Owner

mathandy commented Apr 6, 2017

@alphanoob1337 to make a pull request when you don't have commit permission you first need to create a fork (on your own repository), commit to that, then click the "create pull request" button on this (the svgpathtools) repository and you should see your own fork as a "compare" option.
changing-the-branch-range-and-destination-repository

alphanoob1337 added a commit to alphanoob1337/svgpathtools that referenced this issue Apr 7, 2017
Groups are now taken into account during parsing and transform attributes of groups are applied to the paths before they are returned.
@mathandy mathandy mentioned this issue Apr 27, 2017
@alphanoob1337
Copy link
Author

So let's give this a fresh start. I think we both agree by now, that it is not that simple. If you, @mathandy want to implement it in another way, that is totally fine. I just think it doesn't make sense for me to do it, if you already have very specific ideas about how you want it to be done. It will be more effort for you to tell me how you would like it to be implemented.

Although I am by no means a professional (I just code out of necessity for hobby and professional projects), I do care about coding style. Not about PEP8, but about thinking before implementing. So please let me explain my thoughts about the solution to this issue:

I was assuming that SVGs are typically composed the following way:

  1. There is usually another group and/or a path in a group. So empty groups are rare in general.
  2. Usually there is more than one path object in a group.

The file I had, had 200 paths in 86 groups nested up to four levels deep.

So for the sake of this argument, let's assume it has 200 paths in 62 groups in three layers. Then, in the first layer it would be 50 groups with 4 paths in each them, the second layer would be 10 groups of 5 groups of the first layer and the third layer would be 2 groups of 5 groups of the second layer.

I see two main ways of implementing the group transformations. Let me call them 'top-down approach' and 'bottom-up approach'. The top-down approach is what I preferred and implemented. It is recursively walking through the XML tree of the SVG and parsing the paths and groups along the way. I call this top-down because one goes from the root to the branch. In a hierarchy this is called 'top-down'. The bottom-up approach would be to walk through the XML tree once and identify all the path objects along the way. Then one loops over the parent nodes for each path to parse and apply the group transformations. This is parsing the groups starting from the leaves to the root. Hence 'bottom-up'.

As always, both approaches have their benefits: The top-down works best if the structure is deeply nested and if there are multiple paths in one group. However, empty groups are bad for performance. The bottom-up approach doesn't parse empty groups. Also it works well if there is one group per path. Therefore this approach is the better choice for shallow structures. One can also implement hybrid approaches, where the groups are parsed top-down and the paths independently of that. Then the transformation matrices are applied afterwards.

Apart from the thoughts about the best way to do it, there are also practical considerations, that play a role during the choice. The ones I see as most prominent are the following: In the current implementation the structure of the SVG file does not play a role so far. So it was perfectly valid, to extract the paths and work on them after they have been extracted from the XML tree. Since there were library functions available for the task, which can be assumed to be somewhat optimized, I agree that this was the best choice at that time. So to implement the top-down approach, one has to change existing code unnecessarily and the order in which the objects are returned by the function changes. The bottom-up had the clear disadvantage for me, that it was not the best approach for my example described above: in total, 600 group transformations would have had to be parsed and applied. In the top-down it is only 62 groups have to be parsed. Of course this is highly non-linear with the complexity of the file. In addition it allowed for all the other objects that are parsed into a path object, to be dealt with in the same walk through the tree. In principle there is only a small performance penalty for a hybrid approach. But there is a huge practical drawback: I was not able to come up with a good way to identify the individual path objects during the walk through the tree: paths usually have ID attributes, but you can not rely on this. The order in which minidom returns the objects in getElementsByTagName is not documented and therefore also unreliable (in the current version the elements are returned in order of appearance in the file). Of course one could say that it is unlikely that the order in which getElementsByTagName returns the nodes changes and rely on the current implementation. But still it would make the code more complex to read if two different approaches are mixed together.

Since for the svgpathtools the order in which the paths and converted objects are returned was not documented either and also was subject to change with minidom anyway, I decided this is undesirable but not a huge issue. The change of existing code was also undesirable to me, because it requires additional work, but in my opinion it was worth the hassle because it is the more elegant choice and it avoids walking the XML tree unnecessarily often and allows to directly cross-parse other nodes into paths in the same walk, which reduces the number of walks even compared to the current implementation without any groups. Also other complex features, like parsing the other group attributes and overwriting parts of the path attributes where needed, can be implemented in the same XML tree walk, if that is desired in the future.

Of course this is a question of style and therefore it comes down to personal opinions and preferences. Since you are maintaining the project it should be your choice. However it was not really clear to me from your comments, which approach you prefer. Is it the bottom-up or the hybrid one? I went ahead with the top-down approach, because it appeared to be the most logical one to me and my assumptions about SVG files. Also it complied best to my personal preferences, as I value simplicity, scalability and performance over keeping well-tested code. But as I said: it is fully up to you to decide which changes are unnecessary and which are not. I am just explaining here why I decided to implement it the way I did in my fork.

@mathandy
Copy link
Owner

mathandy commented May 3, 2017

Thanks for the explanation of your thinking @alphanoob1337.
There are many improvements I'd like to make to the svg2paths function and I'm not opposed to the top-down approach you describe. That said any drastic changes we make to well-tested code should come with thorough unittests.
You could have implemented this in a way that would have very little impact on current functionality, but you didn't. So how do I know the code still works for all cases that it worked for before? Am I just supposed to trust you? I could certainly add thorough unittests myself, I'd have to add unittests for the old and the new functionality since you replaced the old code, but I could do that. It would take a lot of my time though.

Does that make sense? If so, what would you suggest my next step be?

P.S. The recursive structure in your code is aesthetically an excellent choice, but I'm worried about its performance and scalability compared to other options. What do you think?

@mathandy
Copy link
Owner

mathandy commented May 8, 2017

@alphanoob1337 would you mind me using some of the code from your suggested fix to add support for groups? I opened a branch for this issue. If you make a new pull-request to the issue16 branch, I'll merge it and use your code as a base for the new functionality.

@alphanoob1337
Copy link
Author

Sorry for the slow response, but I am fairly busy at the moment. Yes, please use as much of my code, as you want to. I have a five hour train ride today, where I plan to clean up the PEP8 compatibility and create a pull request.

Yes. Python sometimes behaves funny in terms of performance. Appending the paths to a list in the loop as is my main concern at the moment since copying the list of paths around is very costly in terms of performance. However I was not really able to figure out a faster way to do this, but I am confident, that it still is fast enough for most every-day situations: I checked the code against a large SVG file from Wikimedia Commons: Geological map of East-Westfalia created by the author Hagar66 and it took my laptop 6.31 to 8.60 seconds to run the svg2paths command on it, although the file has 22,920 paths and also some groups with transformations in it.

According to my experience often it is more efficient, to parse a file of unknown size twice: once to count the number of objects in it and then allocate the memory needed. Then, in a second run, the data is actually parsed and written to the allocated memory. This works well i.e. with large numerical data files. In this case however, the space needed in memory to store a path object is not clear a priori. So the pre-parsing and serialization will probably create more overhead than it speeds up the code.

So I think there is different ways of continuing from here for the group implementation:

  1. The quick solution would be to do a minimal implementation that only applies transformations to the d string. This should be well documented so that it will not cause confusion why other group features are not working.
  2. The full package requires a lot of work and will open a Pandora's box. Implementing full attribute parsing and overwriting path attributes by group attributes where needed would be the desired behavior in the long run, since it allows the user of the library to further process attributes like line width or color. However this easily becomes a very complex task, since some attributes might be overwritten only in part. And there are complex group features like clipping of paths.

So those are my thoughts on this. What do you think about how to proceed?

@mathandy
Copy link
Owner

mathandy commented Jun 4, 2017

@alphanoob1337 as soon as I've fixed the bug I'm currently working on (hopefully today, but I don't know -- I've got a lot on my plate), I'm going to play around with an implementation of your (1) -- I'll try a bottom-up approach to make it as self-contained as possible.

As far as (2) goes... I'm not sure how inheritance works in SVGs.
In general it looks like child attributes override parent attributes, but then there are examples where this is not true, e.g.

If these are the only two exceptions to the children-override-parents rule, then implementing (2) doesn't sound so bad... though that's a pretty big if.

To anyone reading: Do you know of any documentation where inheritance in SVGs is well-explained?

@alphanoob1337 like you say, it seems like (2) is opening pandora's box a bit, but it might be worthwhile to add an experimental implementation that comes with a warning in the documentation. On a somewhat related note, if your original top-down approach offers speed benefits, it might be worthwhile to offer it also as another "experimental feature".

@mathandy
Copy link
Owner

mathandy commented Jun 4, 2017

Just another quick note:
Since we're talking about flattening attributes (so our xml tree becomes a list of paths), it will likely be necessary to check that duplicate attributes are handled properly.
E.g.
<circle cx="25" cy="25" r="15" transform="translate(0,-10)" transform="translate(0,10)"/>
In a quick test, Chrome/JSFiddle ignores the second transform (the same happens with other attributes I tried).

@alphanoob1337
Copy link
Author

Hi @mathandy,

I was mostly concerned with the style attribute, because it requires full CSS parsing to resolve the inheritances. A group can have style properties, that the path overwrites but the others also apply to the path. Also there is style classes which can be defined for groups and path elements, that are also inherited according to rules. And last but not least, there is inline stylesheets in SVG that can address specific elements (groups or paths) by their respective ID and change their style.

For SVG 2 it is required for the user agent to parse the CSS. In the SVG 1.1 documentation I found the hint that the CSS cascade must be obeyed if CSS is parsed (this was optional for SVG 1.1). The inheritance rules for CSS are given in chapter 6 of the the CSS documentation. Those can get really complicated and this fact alone was reason for me not to deal with inheritance at all. But maybe there is some Python code for CSS parsing around, you can just import as a dependency...

And another thought: I would intuitively say that clip-paths are inherited similar to transformations so that all of them apply to the path in the respective order...

Cheers
alphanoob1337

@sbliven
Copy link

sbliven commented Jul 4, 2017

Support for transformations will be very useful. Inkscape uses transformation matrices on many objects by default, resulting in completely misplaced paths. I hope the code-style issues get resolved quickly.

@alphanoob1337
Copy link
Author

Hi @sbliven,

This is not only the case for Inkscape but also for Adobe Illustrator.

If you don't care about the style properties of the paths but only their geometry, you can download my fork https://github.com/alphanoob1337/svgpathtools . I think it is not fully up to date, but mostly.

Cheers
alphanoob1337

@mathandy
Copy link
Owner

@alphanoob1337 and @sbliven, I'm with you. I just pushed a partially implemented fix. The idea being to add a container for a DOM-style document. It's not fully implemented but I was pretty verbose in my plan in the dostrings (see here). This will also add support for modifying SVG-Path elements inside html files (or any DOM-style file).

Typical usage will look something like:

    >> from svgpathtools import *
    >> doc = Document('my_file.html')
    >> for p in doc.paths:
    >>     foo(p)  # do some stuff using svgpathtools functionality
    >> foo2(doc.tree)  # do some stuff using ElementTree's functionality
    >> doc.display()  # open modified document in OS's default application
    >> doc.save('my_new_file.html')

There will also be (or so I'm planning). A "flatten_paths" function or method to remove all transform attributes from the document and apply them directly to the paths.

To anyone reading:
I believe this is an important improvement to svgpathtools, but I know very little about html, javascript, etc. (I'm a mathematician). If you see any problems with how I'm coding (or more importantly, the philosophy or practicality of) this new Document class. Please let me know.

I've got very little time to spare on svgpathtools these days, but I'm hoping to finish this new feature in the next few weeks.

Also: Anyone have a good idea how to implement automated testing (unittests) for this?

@DavidAntliff
Copy link

It would also be useful for the id attribute of each path's parent group to somehow be available after calling svg2paths. As it is right now, the group information is lost and there's no simple way to know which group a path came from.

@nishadhka
Copy link

@mathandy Many thanks for the library. @alphanoob1337 support for transformation will be much helpful for extracting geographical maps in PDF files as embedded as svg path segments. Is this feature is enabled in the current version? how to use that. As per this method to preserve relative position of path segments, how to collect the transformation matrix for each path segments, is this available as like path and attributes collection by svg2path,

@alphanoob1337
Copy link
Author

@nishadhka there is an intrinsic problem with this and this led to the proposed changes not being included in the library: It is not possible to map the group transformations onto attributes. A simple example would be the line width of a circle wich you stretch along the x axis by a factor of two: if the thickness attribute of the path is two, the resulting ellipse after the group transformation will have a line width varying from two to four along the path, because the object is first created and then transformed. This can't be mapped on a flat list of paths but would need some Python implementation of SVG groups which is a lot of work to do and will probably cause problems if you want to use the path objects with other libraries.

If you are only interested in the path itself, one can use the fix I proposed. If you need the attributes such as line width to be transformed as well, this will not do the job for you. If you only need the path itself, please tell me, then I will provide you an updated fork of this library which you can use.

@nishadhka
Copy link

nishadhka commented Mar 28, 2018

@alphanoob1337, Your transformation support just works for the requirement, thank you very much. I used the svg2paths from your fork and it's parse returned transformed svg paths with its relative position, which was then converted into as Shapely coordinates and into polygon. The pdf page is as below

which was converted into svg using Inkscape command line tool and then required path/geometry extracted from the list using svgpathtools, with out using the transformation, each polygon's(in svg, these polygons are separate Line Path segments) lost its relative position and output ends up in pile of polygons placed one above the other, the transformation does the correct positioning of each polygon, the resultant Shapley polygon is as below.

There are minor geometric issues which I guess happened during conversion into polygon. It was requiered to flip the resultant polygon horizontally to get the desired output. The attribute issue was a problem in using wsvg, writing the svg, the shapley polygon conversion saved me from it.

@mxgrey
Copy link
Contributor

mxgrey commented May 2, 2018

I'm working on a python project that needs to consume SVG files, and ideally I would like to be able to traverse the original group structure of the SVG after the file is parsed, but still have convenience functions to flatten out the paths from the groups that they're nested inside of.

There are other python SVG parsing libraries that seem to handle the group transforms, but their APIs for traversing the group structure as a user seems rather lacking. Based on what I've seen, I think the svgpathtools library has the most comprehensive design, and the code in the ElementTree branch looks promising.

I think the one design change I would propose to ElementTree would be to add a Group class, which would allow for a more natural tree of Groups (It looks like the current design intends to have a tree of xml.etree.cElementTree objects?). Then for the Document class, I can imagine three choices:

  1. Have Document extend the Group class, and make all the ungrouped paths part of the Document group.
  2. Give the Document class a default anonymous Group instance with no attributes that would contain all the ungrouped paths.
  3. Store any ungrouped paths in a list held by the Document class. I think this would be my preferred approach.

I'm happy to make these changes, implement the missing features, and write unit tests for all the currently untested features if you'd be on board with this idea. Since I have deadlines to meet, I'll probably move forward on this very soon (with or without your buy-in), but ideally I would like to submit any progress that I make upstream to you.

Disclaimer: I'm rather new to python (my expertise is in C++), so when I get around to submitting a pull request, there may be some novice python-specific issues that need correcting.

@mathandy
Copy link
Owner

mathandy commented May 3, 2018

Hey @mxgrey , that sounds good! @alphanoob1337 already implemented some code to flatten groups which might be useful.

Assuming you meet the following two requests, I'll merge in whatever pull-requests you make.

  1. Implement the new features in a way that won't interfere with older functionality -- i.e. any old scripts made with svgpathtools should still function the same. I don't imagine that will be a problem, but let me know if it is.
  2. Add a little explanation at the top of the README.ipynb demonstrating the new functionality.

Thanks! What your suggesting would be the most significant improvement to svgpathtools since its debut. Once it's had a month or so to be tested by others, I'll make a new release on PyPI so your new code will be included in pip installs. So make sure it works please :)

If you want advice on anything, just ask!

@mathandy
Copy link
Owner

Looks like we have a solution to this now thanks the new Document class pushed recently by @mxgrey !
I haven't done much testing, but I'm impressed -- it looks like really good work.

I'll leave this issue open for a bit to encourage input on any minor changes that should be implemented before posting this on the PyPI -- it'll be tagged svgpathtools v2.0 when I do post it.

The one significant change I'm likely to implement is to have flatten_all_paths return a list of Path objects -- the ElementTree element and transform can be stored as attributes. Do you see any problems with this @mxgrey ?

Two minor changes so far:

  • all np.matrix() calls were replaced with np.array() -- numpy is deprecating matrix objects.
  • the save() method now requires a filename.

@mxgrey
Copy link
Contributor

mxgrey commented Aug 23, 2018

The one significant change I'm likely to implement is to have flatten_all_paths return a list of Path objects -- the ElementTree element and transform can be stored as attributes.

That's totally reasonable. I used named tuples to avoid treading on the Paths class, but if it's okay to make those attributes of Path, then I think that offers a much cleaner API.

@tatarize
Copy link
Contributor

Some of the earlier discussions about going backwards up the line made me suspect some flaws were going to crop into that code. Since the CTM is always the concatenated result of all previous groups, and the groups list of transforms would need to be processed backwards I'd have a great a-ha comment.

But, nope. Even with my expertise having written svg parsers before and multiple readings of the SVG documentation. That's solid code. I looked and the only thing I could find is that, technically, according to the BNF: Tab, Linefeed, and Carriage Return are also white spaces. So:

<g transform="
translate
     (-10,-20)
scale
     (2)
rotate
     (45)
translate
     (5,10)
">

Is technically correct but doesn't seem like it would be parsed (see parser.py 223, 224). But nobody does that.

@mxgrey
Copy link
Contributor

mxgrey commented Aug 27, 2018

according to the BNF: Tab, Linefeed, and Carriage Return are also white spaces.

I suspect that would still work, unless I'm overlooking something.

On this line we have type_str contain everything that comes before the opening parenthesis (we've already split up the string based on the closing parentheses, so there shouldn't be any contamination from a previous transform in the sequence). Then we look for each keyword anywhere inside of type_str. I think tabs, newlines, and other forms of spacing between the keyword and opening parenthesis will be ignored, if my understanding of Python's in operator for strings is correct. Just in case I'm wrong about this, it would be good to add a unit test for this scenario.

One issue that I would be concerned about is that we might be too lenient about what kind of transform strings we accept. For example, translate scale (-10, -20) will currently be accepted and read as if it said translate(-10, -20), but that's a malformed SVG. So I would pose the question of whether we want the code to aggressively identify and error out when given a malformed SVG, or should we just declare that malformed SVGs may result in undefined behavior?

@tatarize
Copy link
Contributor

For that you're right. From my take it slices at the ) of the parameters between the items. Then would hand it.

translate
     (-10,-20

The slice with the ( would give it:

"translate
     "

and

"-10,-20"

So yeah with in it would certainly not have an issue, the line I was looking at was:

value_str = value_str.replace(',', ' ')
values = list(map(float, filter(None, value_str.split(' '))))

So maybe feed it something like rotate

<g transform="
rotate
     (   	 	 45)
">

Though that might fall into the same successful parsing anyway. Or it might try to cast some spaces and tabs into being floats. Even with malicious intent I can't see a clear way to fully malform it consistent with the SVG BNF but inconsistent with the parsing here.

you could do something like do_not_scale(2) and the SVG says that becomes SVG_TRANSFORM_UNKNOWN but it would scale, but those are generally malformed anyway so you should expect inconsistent behavior.


I suppose the 2.0 recommendations include allowing transform on the svg object itself, and Document specifically looks for the group objects and doesn't account for that. But, if dinging you on 2.0 recommendations were permitted there's clearly no bearing bB command or z curve closing but all of that falls into different issues, that are feature requests and not bugs (also, everybody loves z curve closing, but bearing is a nightmare).


The document parsing in the Document.py will certainly work and be reasonably effective. There is however a slightly less flat but perhaps more effective way of doing that. The SVG inheritance is generally that all children of all groups inherit all properties. The transforms are concatenated and the same values are replaced.

Programatically SVGs can better represented the same nested structure in the SVGs themselves and a dictionary of subproperties and a matrix, while this sort of looks like a DOM tree, you can kind of implement it more like a light-tree structure parsed with a SAX parser, with end nodes that generally have a path, a matrix, (if needed) a local dictionary, and an inherited dictionary. Where each parent node gives a copy of its dictionary and matrix to its children (when that tag for the child opens). Basically a minimum viable preservation of most svg documents. You basically pre-flatten the end nodes, not just for the transformations, but also for the dictionary of properties you might care about, and preserve the structure a bit.

While there's some obvious advantages to keeping the entire DOM. Unless you want to save things again with entirely preserved structure, it's not really worth it. Consider something like a fill color. If you had to go back and find all the groups and then check for the most recent parent, or current object with a fill="lime" or style="opacity:1;fill:#000" to figure out the expected fill color for the SVG it might be kind of nightmarish. But, it wouldn't be too hard to find all nodes with paths, and query that node dictionary for its value for "fill". And you could use that data to rebuild the document.

You could also just ignore all that structure, use a SAX parser and parse all the elements maintaining a stack of matrices and dictionaries as it goes through the tree, and add each leaf node to the list, as you go. Which you could still build an SVG from but you'll lose the structure. Which seems like it would do the same thing the flatten_all_groups is doing but will simply load the SVG document flat, rather than load the entire DOM and search it for that data, and make a point to maintain the properties that apply to that particular path.

When I implemented this I had the parsing of the document generate paths prefixed with a fake svg-path commands I called "x" for matrix which literally just contained the six values for the matrix, and "f" for fill_color and "k" for stroke_color, followed by the RGB integer for those colors. In my usecase I didn't care about more than that, and that basically entirely defined all shapes I cared about.

@tatarize
Copy link
Contributor

Figured it my comments might have been a little bit oblique with regard to SAX parsing for the flattened values. Basically if you want to parse the SVG document quickly, you need something like in the pull request #66 there. I obviously missed some spots I shouldn't have. And there might be better ways to do some of that, or more compatible ones.

You should be able to tell from it's much faster. Like day and night. Less memory problematic. But, moreover since you're already skidding through the file, it's not actually that hard to track the matrix or the process the paths. We are just extracting the information we need rather than loading the entire DOM then collapsing those elements. It also allows you to properly keep track of the fill and stroke values which oddly enough people will care about.

@tatarize
Copy link
Contributor

tatarize commented Sep 3, 2018

Sax Parsing: 15.340695629785454
22920
Dom Parsing: 17.065750999880045
22920

import time
from svgpathtools import *

filename = "East-Westfalia_-_Geological_Collection.svg"

t0 = time.clock()
doc = SaxDocument(filename)
t1 = time.clock()

total = t1-t0
print("Sax Parsing: " + str(total))
print(len(doc.tree))

t0 = time.clock()
doc = Document(filename)
paths = doc.flatten_all_paths()
t1 = time.clock()

total = t1-t0
print("Dom Parsing: " + str(total))
print(len(paths))

Doesn't actually seem much faster, though seems like there's likely some ways to speed that up.

@tatarize
Copy link
Contributor

tatarize commented Sep 4, 2018

Got it to about twice as fast. Switching over to iterparse and skipping applying the matrix if it is identity (which should be basically always).

Sax Parsing: 6.087709155595086
22920
Dom Parsing: 14.686225500378885
22920

But, it basically already reads flat, with only processing or caring about the leaf nodes and the properties inherited by the leaf. And taking that matrix equally in stride compounding in the stack as it iterates through.

@tatarize
Copy link
Contributor

tatarize commented Sep 6, 2018

@mxgrey I did finally find a bug! #70

@geooff
Copy link

geooff commented May 8, 2019

Are there any plans to include support for transformations? I see a lot of good discussion here and it seems @alphanoob1337 has produced some limited functionality but are there any plans to include this functionality in a future release?

@tatarize
Copy link
Contributor

tatarize commented May 9, 2019

Do you mean as a proper release here and on PyPI? Both the Sax and Dom functionalities do transformations but it does look like the last release was done before that functionality was added.

@geooff
Copy link

geooff commented May 9, 2019

When building from scratch the image below still cannot be rebuilt without alteration using the following code taken from the README.

sample_image

# Update: You can now also extract the svg-attributes by setting
# return_svg_attributes=True, or with the convenience function svg2paths2
from svgpathtools import svg2paths2, wsvg
paths, attributes, svg_attributes = svg2paths2('svg-logo-h.svg')

# Let's make a new SVG that's identical to the first
wsvg(paths, attributes=attributes, svg_attributes=svg_attributes, filename='svg-logo-h_processed.svg')

Note: The issue could be with something other than the inclusion of transforms I'm still learning about the SVG format.

@tatarize
Copy link
Contributor

Looking at that code, my best guess is that the code here doesn't actually load the def tags. Specifically the way that code is working is that it defines a def tag and then calls that def tag later in the body of the SVG. While this is clearly a valid SVG, it's not, as far as I'm aware something that either the DOM or SAX parser deals with.

@mathandy
Copy link
Owner

I just pushed some improvements to the Document functionality. I'd like to be able to do the following but we're not quite there yet.

from svgpathtools import *
doc = Document('heart.svg')

# rotate all paths 90 degrees
flattened_rotated_doc = Document()
for p in doc.paths():
    flattened_rotated_doc.add_path(p.rotated(90), p.element.attrib)

flattened_rotated_doc.display()

My "heart.svg" is the following example (taken from here).

<svg viewBox="-40 0 150 100" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <g fill="grey"
     transform="rotate(-10 50 100)
                translate(-36 45.5)
                skewX(40)
                scale(1 0.5)">
    <path id="heart" d="M 10,30 A 20,20 0,0,1 50,30 A 20,20 0,0,1 90,30 Q 90,60 50,90 Q 10,60 10,30 z" />
  </g>
 
  <use xlink:href="#heart" fill="none" stroke="red"/>
</svg>

@tatarize
Copy link
Contributor

If I wanted to do that I think my code would be:

doc = SVG('heart.svg').parse(transform='rotate(90)')

Also, you could add some dunder methods to documents. Assuming you're adding a path to that object.

flattened_rotated_doc = Document()
for p in doc.paths():
    flattened_rotated_doc += p.rotated(90)

flattened_rotated_doc.display()

Also, if you want to store transforms like that on paths, I have some code for those various elements. Matrix does all the standard matrix operations and stores the variables as defined in the svg spec. It depends on some definitions for the classes Length and Angle but these are entirely with regards to the CSS code, that's part of the standard. You can legally in modern SVG do commands like rotate(0.2turn) and translate(1in, 1in), so parsing the transforms required that stuff. I'd need to slightly tweak how points are used in the class going back to .real and .imag (my Point class is backwards compatible with that type of point but still really a class), but that's very minor tweaks to like 2 functions. But, it has built in parsing for transform strings. Matrix('rotate(90)') becomes an object you can manipulate. Pretty easily. Or just get the six values from (svg standard defines the 6 value form).

Full fledged parsing is hard, and I had to read through so many different technical documents to get mostly right. But, I did build things really modularly so you could just use my Color class that conforms to the svg spec without needing any of the rest of the code. Color('#FF0'), Color('hsv(30, 10%, 40%)'), Color('maroon').

To get all this stuff right I had to add Rect Circle Ellipse and other such objects. As well as weird things like Viewbox, because a viewbox can basically define a kind of equivalent transformation if it's zoomed in, you gotta make the stuff in it a lot larger with the right meet or slice values and preserving the aspect ratio. (SVG 2.0 8.2 equivalent transform of an SVG viewport.).

TL;DR SVG Parsing is a deep deep hole.

Though if you want, the patch for the use/def objects is posted in #90. You can just create a stealthy generator for the iterparse generator to get the use stuff to work.

The code you have is fantastic with math, in fact several of your bbox() ended up in my svgelements But, even compared to svg.path is a bit dated with the Paths. regebro added in Move and Close classes to allow for some subpaths and corrected a few bugs. If you want SVG parsing, go ahead and steal liberally from svgelements really the thing that keeps bringing me back here is the fantastic math functions you have. I desperately want your cubic bezier length code for the contingency when scipy is installed but I can't condense it to a easy to understand function. If I could copy and paste most of them, I would do that in a heartbeat.

I made one such function for Arc, if I had cubics I'd have a full set of functions with scipy that would instantly give exact lengths:

    def _exact_length(self):
        """scipy is not a dependency. However, if scipy exists this function will find the
        exact arc length. By default .length() delegates to here and on failure uses the
        fallback method."""
        from scipy.special import ellipeinc
        a = self.rx
        b = self.ry
        phi = self.get_start_t()
        m = 1 - (a / b) ** 2
        d1 = ellipeinc(phi, m)
        phi = phi + self.sweep
        m = 1 - (a / b) ** 2
        d2 = ellipeinc(phi, m)
        return b * abs(d2 - d1)

@wesbz
Copy link
Contributor

wesbz commented Feb 25, 2021

Hi, I was playing around with your tool and notice that the transformations described in group tags do not affect the contained paths yet. Where is this project at? Just to know how I may help.

@tatarize
Copy link
Contributor

tatarize commented Feb 26, 2021

I believe in this implementation which I have sort of come around to it's under flatten() or flatten_groups() . You can modify the groups then call flatten and get it to respect that somewhat. My own project svgelements for example would only ever work during the initial parsing phase which is basically like flatten here. Or within a svg conforming webbrowser part of the rendering processing. I recently added groups to my project so if you take a bunch of paths add them to a group then multiply that group by a transformation it will respect that and transform all the contained items. But, this is lossy. group * "scale(1.2)" would scale it up by 20% but you couldn't unscale them by removing them from the group.

Part of the issue here is a larger one with all the various svg implementations. They aren't really subsets of doms. Which is what the document class here is meant to address. I think meerk40t/svgelements#87 captures the problem fairly well. I think what needs to be done is take all the code from svgwrite which is mostly just a pure svg dom application for writing svg to disk combine it with a proper element and node dom system (though likely not minidom since it adds nothing), then add-in bootstrapping of element nodes, and file reading. And then do some of the same processes that svgpathtools, svg.path, and svgelements during a render routine, from that dom, which references back to the original nodes but is not those original nodes.

Flatten should work here if you modify the group and you need to propagate them down. In svgelements you can multiply a group containing things by a matrix and it'll respect that. But, like most things in these render-only libraries it's kinda destructive.

@alphanoob1337
Copy link
Author

Just to adding more context to tatarize's answer: It is important to understand that many transformations can inherently not be flattened out without loss of information. Do you just want the coordinates of the path to be updated according to the group transformations or do you want to render an SVG with transformations? The latter is harder to achieve: Imagine how scaling a complicated path with a certain line width along the x-axis but NOT along the y-axis using a group transformation will give you a line with varying thickness in most SVG rendering software. In many cases flattening the transformations will give you the expected result (i.e. in a filled shape). If you are just interested in the geometry of the path but not in rendering it you can completely ignore this comment because flattening is what you want.

@tatarize
Copy link
Contributor

tatarize commented Mar 1, 2021

Having read the SVG docs a bunch of times. While the behavior alpha describes is how that often works. The SVG Spec is kind of silent on the issue. My own project scales the stroke widths by the square root of the determinant of the matrix in that case which would scale up the strokes but would not create non-uniform stroke scaling.

Having the actual values in the object update based on the group modification would require re-rendering a properly dom based system (which I'm planning on writing and doing like that). You'd have to take the change in group and flag everything below it as dirty and then process the rerender if any function tried to get an update on that point.

@alphanoob1337
Copy link
Author

Yes, with "expected behaviour" I was referring to how Inkscape handles SVG group transformations. I see Inkscape as the de facto reference implementation of the SVG standard, but that is a personal opinion and in this case there is no conflict with the W3C standard. For some projects rendering is irrelevant (i.e. I was using svgpathtools for the optimization of SVG files for laser cutters). Even for most projects where rendering matters, tatarizes approach will match the expected result or will be very close to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants