Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NonEmptyVector and NonEmptyMap #836

Open
TimPigden opened this issue Dec 23, 2021 · 13 comments
Open

NonEmptyVector and NonEmptyMap #836

TimPigden opened this issue Dec 23, 2021 · 13 comments

Comments

@TimPigden
Copy link
Contributor

TimPigden commented Dec 23, 2021

Hi, My data models make extensive use of NonEmptyVector from cats. And I have desire for NonEmptyMap that's not based on sorted data. Would they have a place?
I notice the NonEmptySet here is not based on TreeSet - unlike cats where for reasons I don't really understand they can't use Set
"The problem is that the std lib Set type is unparametric and a non-empty version purely based on that one wouldn't find a place in cats. Any set based on hashing would have to take a Hash[A]."
Could predude NonEmptyVector and Map simply wrap the default Scala ones?
(in which case I could knock it up pretty easily)

@adamgfraser
Copy link
Contributor

ZIO already has a NonEmptyChunk data type that you can use as a replacement for NonEmptyVector. A NonEmptyMap that wraps a Map similar to how NonEmptySet wraps a Set would certainly be welcome.

@TimPigden
Copy link
Contributor Author

Adam - if you remember I had a shot at benchmarking Chunk for a hackathon about a year ago (sadly I never got to finish). at the time the result was that performance of chunk could be better or much worse than vector for random access, depending on how the Chunks were created - whereas Vector is consistent. Has that changed?

@adamgfraser
Copy link
Contributor

There has been significant work on performance since then though random access will always be faster if you materialize it first.

@TimPigden
Copy link
Contributor Author

ok, nice - I've just been looking at chunk. I'm planning a big cats->prelude splurge on my code base for a few weeks time. I'll do the NonEmptyMap then

@adamgfraser
Copy link
Contributor

Great! Let me know if you need any help!

@TimPigden
Copy link
Contributor Author

I'm looking at this now. I've been thinking, I ought to do the NonEmptyVector at the same time - the reason being anyone coming over to prelude from cats may be using NonEmptyVector and not be entirely trusting that Chunks are going to give them exactly the same or better performance characteristics.
And - a weakness in the cats NonEmptySet implementation is it entirely based on TreeSet, which is
a) less efficient and b) requires creating of an ordering.
You've gone the other way and only create a HashSet implementation. So if I can make it without too much disruption I'll see if we can have a NonEmpty TreeSet and TreeMap as alternative implementations.
@adamgfraser

@TimPigden
Copy link
Contributor Author

TimPigden commented Feb 19, 2022

@adamgfraser ok, a bit of help required.
I downloaded master
I've written all the code, I've absolutely no idea what I'm doing with respect to cross-compiles and so on (we simply don't need it at work)
or what version of zio I should be running.
Is there a simple hack for the build environment so I can compile and test under my current dev setup without all this scalanative and js stuff? And how am I supposed to make it work with scala3?
i'm currently running 2.13.8 and either zio2 or 3
I think I can manage to ensure the actual collection stuff is compatible with 2.12 style collections

@adamgfraser
Copy link
Contributor

If you just run SBT and go to one of the JVM projects you can test that on just the JVM and the latest Scala version. Eventually it will need to pass on all Scala versions but I don't think there should be a lot of issues there.

I worry that a NonEmptyVector is not really a data type we recommend using (we should just use NonEmptyChunk instead) and there is significant boilerplate code there in re-exporting all the collection methods. I would rather just make our optimal solution for a general purpose collection type fast than have two implementations for essentially the same thing.

For NonEmptySet, I think a NonEmptySet should just wrap anything that honors the Set interface and it doesn't really need to care what implementation is used under the hood, through when we construct our own set we would probably use a hash set. If you wanted to implemented a NonEmptySortedSet that provides additional operators when the set is sorted I think that could make sense.

@TimPigden
Copy link
Contributor Author

TimPigden commented Feb 19, 2022 via email

@TimPigden
Copy link
Contributor Author

TimPigden commented Feb 19, 2022 via email

@TimPigden
Copy link
Contributor Author

Actually on second thoughts I ought do the range stuff, so I'll need a parallel full implementation

@TimPigden
Copy link
Contributor Author

oops - ignore the pull request. I forgot the additional tree methods!

@TimPigden
Copy link
Contributor Author

ok, just to note what's been done:
3 new things added NonEmptySortedSet, NonEmptyMap and NonEmptySortedMap
NonEmptySortedSet is almost exactly the same as NonEmptySet - I tried to factor out the common stuff but all the types became an unreadable nightmare and I gave up. Likewise NonEmptyMap and NonEmptySortedMap are mostly the same.
NonEmpySortedSet has 2 todos in the code - I couldn't figure out what to do with CommutativeEither and Invariant with the implicit needed for the SortedSet.
And I didn't really understand any of that bit or the NonEmptySortedSetSyntax - so they will need to be checked I think.

sorted.. max/min and range type operations didn't need anything because they all subset the parent object and thus can produce empty collections - so defaulting to the underlying type is the correct behaviour there.

Finally I added some groupBy operations to NonEmptyMap - because groupby, will by definition produce things of NonEmpty type and so it seemed useful (I have such operations in my current codebase)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants