https://wiki.haskell.org/api.php?action=feedcontributions&user=Artyom+Kazak&feedformat=atomHaskellWiki - User contributions [en]2024-03-19T02:09:30ZUser contributionsMediaWiki 1.35.5https://wiki.haskell.org/index.php?title=Consultants&diff=62948Consultants2019-06-28T17:45:32Z<p>Artyom Kazak: /* Europe */</p>
<hr />
<div>== Australasia ==<br />
; Ben Lippmeier http://benl.ouroborus.net<br />
: [mailto:benl@ouroborus.net benl@ouroborus.net]<br />
: Sydney, Australia.<br />
<br />
; TLC SRC http://tlcsrc.ml<br />
: Steven Shaw<br />
: Blog: http://steshaw.org<br />
: [mailto:steven@steshaw.org steven@steshaw.org]<br />
: Brisbane, Australia.<br />
<br />
== Europe ==<br />
; Clockwork Consulting ApS https://www.cwconsult.dk/<br />
: [mailto:info@cwconsult.dk info@cwconsult.dk]<br />
: Copenhagen, Denmark. Remote work only (on-site possible if you are in Copenhagen; just ask).<br />
<br />
; NilCons http://www.nilcons.com/<br />
: [mailto:cons@nilcons.com cons@nilcons.com]<br />
: Zurich, Switzerland.<br />
<br />
; Tweag I/O http://tweag.io/<br />
: [mailto:hello@tweag.io hello@tweag.io]<br />
: Europe, US, Russia, South America. Headquartered in Paris, France.<br />
<br />
; Monadfix https://monadfix.com<br />
: We help companies adopt functional programming (Haskell, Agda, PureScript, Clojure)<br />
: [mailto:hi@monadfix.com hi@monadfix.com]<br />
: Europe and Russia, remote work worldwide.<br />
<br />
; Well-Typed LLP http://www.well-typed.com/<br />
: The Haskell Consultants<br />
: [mailto:info@well-typed.com info@well-typed.com]<br />
: Europe and US on-site, remote work worldwide.<br />
<br />
; zerobuzz UG https://zerobuzz.net/<br />
: [mailto:info@zerobuzz.net info@zerobuzz.net]<br />
: Germany and Switzerland<br />
<br />
; Turing Jump https://turingjump.com/<br />
: [mailto:info@turingjump.com info@turingjump.com]<br />
: Europe on-site, remote work worldwide.<br />
<br />
== North America ==<br />
; AppSolutions LLC http://www.appsolutions.com/<br />
: Anton van Straaten http://www.appsolutions.com/anton/<br />
: New York, Boston, Philadelphia, Washington DC<br />
<br />
; Boltmade http://boltmade.com<br />
: Waterloo, ON, Canada<br />
<br />
; Brett Letner http://www.linkedin.com/in/brettletner<br />
: Lawrence, KS, USA<br />
<br />
; Civic Labs [https://www.civiclabs.com https://www.civiclabs.com]<br />
: New York, NY and San Francisco, CA<br />
<br />
; LambdaPix http://lambdapix.com<br />
: Conal Elliott<br />
: Blog: http://conal.net/blog<br />
: San Andreas, CA, USA<br />
<br />
; Obsidian Systems [https://obsidian.systems https://obsidian.systems]<br />
: New York, NY<br />
<br />
; OM Consulting Limited<br />
: ''[mailto:omconsult@gmail.com omconsult@gmail.com]''<br />
: Intelligent solutions.<br />
<br />
; Pliosoft Corp. [https://pliosoft.com/ https://pliosoft.com/] <br />
: email ''[mailto:contact@pliosoft.com contact@pliosoft.com]'' <br />
: Haskell consulting and development<br />
: Alberta, Canada<br />
<br />
; Position Development [http://positiondev.com http://positiondev.com]<br />
: New York, NY<br />
<br />
; Sankel Software http://sankelsoftware.com<br />
: David Sankel<br />
: Albuquerque, NM, USA<br />
<br />
; Simon Michael http://joyful.com<br />
: Los Angeles<br />
<br />
; Stack Builders http://www.stackbuilders.com<br />
: New York, NY, USA<br />
<br />
== Asia ==<br />
; ByteAlly Software http://www.byteally.com/<br />
: Chennai, India<br />
<br />
[[Category:Community]]</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=Data_declaration_with_constraint&diff=61127Data declaration with constraint2016-09-19T23:26:04Z<p>Artyom Kazak: /* Solution */ Change the link again</p>
<hr />
<div>== Problem ==<br />
<br />
=== Question ===<br />
<br />
I have declared<br />
<haskell><br />
data C a => T a = Cons a<br />
</haskell><br />
<br />
and I hoped that the type checker would now know that every value of type <hask>T a</hask> satisfies the type constraint on <hask>a</hask>.<br />
I would like to declare an instance of a type constructor class for the type constructor <hask>T</hask>,<br />
but its methods require type constraints that depend on the particular type constructor <hask>T</hask>.<br />
<br />
For example: <br />
<haskell><br />
instance Vector T where<br />
    add (Cons x) (Cons y) = Cons (x+y) -- requires Num constraint on type a<br />
</haskell><br />
<br />
=== Answer ===<br />
<br />
In Haskell 98, only functions can have type constraints.<br />
The type constraint of a <hask>data</hask> declaration only refers to the constructors.<br />
The designers of Haskell 98 now think that it was a bad decision to allow constraints on constructors. GHC as of version 7.2 disallows them by default (turn them back on with <code>-XDatatypeContexts</code>).<br />
<br />
== Solution ==<br />
<br />
You could use GHC's [https://wiki.haskell.org/GADTs_for_dummies Generalised Algebraic Data Types (GADTs)] to add an implicit context to the data constructors:<br />
<br />
<haskell><br />
data T a where<br />
    Cons :: C a => a -> T a<br />
</haskell><br />
<br />
This way, functions using <hask>T</hask> no longer need a constraint. Instead, constructing a <hask>T a</hask> using <hask>Cons</hask> needs a <hask>C a</hask> context, and when something pattern matches a <hask>Cons x</hask>, the context comes with it.<br />
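<br />
As a concrete sketch (using <hask>Num</hask> in place of the undefined class <hask>C</hask>, so the example compiles on its own), pattern matching on <hask>Cons</hask> is exactly what makes the constraint available:<br />
<haskell><br />
{-# LANGUAGE GADTs #-}<br />
<br />
data T a where<br />
    Cons :: Num a => a -> T a<br />
<br />
-- No "Num a =>" needed in the signature: matching on Cons<br />
-- brings the Num a dictionary into scope.<br />
addT :: T a -> T a -> T a<br />
addT (Cons x) (Cons y) = Cons (x + y)<br />
<br />
main :: IO ()<br />
main = case addT (Cons 2) (Cons (3 :: Int)) of<br />
    Cons n -> print n  -- prints 5<br />
</haskell><br />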
<br />
There has been some discussion about whether it is sensible to want this.<br />
<br />
A Haskell 98 workaround is to use [[multi-parameter type class]]es, where <hask>T a</hask> and <hask>a</hask> are separate arguments.<br />
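<br />
A minimal sketch of that workaround (the class name <hask>Vector</hask> and the use of <hask>Num</hask> are illustrative, not from any particular library):<br />
<haskell><br />
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}<br />
<br />
-- plain Haskell 98 data declaration, no constraint<br />
data T a = Cons a<br />
<br />
-- the element type is a separate class parameter, so each<br />
-- instance can state whatever constraint it needs<br />
class Vector v a where<br />
    add :: v a -> v a -> v a<br />
<br />
instance Num a => Vector T a where<br />
    add (Cons x) (Cons y) = Cons (x + y)<br />
<br />
main :: IO ()<br />
main = case add (Cons 2) (Cons (3 :: Int)) of<br />
    Cons n -> print n  -- prints 5<br />
</haskell><br />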
<br />
== See also ==<br />
<br />
* Henning Thielemann in Haskell-Cafe: [http://www.haskell.org/pipermail/haskell-cafe/2004-March/005979.html Context for type parameters of type constructors]<br />
* Mark Nicholls in Haskell-Cafe: [http://www.haskell.org/pipermail/haskell-cafe/2007-December/036762.html nice simple problem for someone struggling....]<br />
<br />
[[Category:FAQ]]</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=Regular_expressions&diff=59155Regular expressions2014-12-16T15:17:30Z<p>Artyom Kazak: /* Overview */ added text-icu</p>
<hr />
<div>[[Category:Libraries]]<br />
<br />
== Overview ==<br />
<br />
Chris Kuklewicz has developed a regular expression library for Haskell that has been implemented with a variety of backends. Some of these backends are native Haskell implementations; others rely on external C libraries such as libpcre. New users may feel overwhelmed with the various options that are available to them. The following table provides an overview of the various features supported by each backend.<br />
<br />
There are also a number of alternate or complementary regular expression libs, including:<br />
<br />
* Bryan O'Sullivan's text-icu – bindings to the [http://site.icu-project.org/ ICU library], which includes Perl-compatible regexes with extended Unicode support. One of the few regex libraries that work with Text rather than String.<br />
* Yoshikuni Jujo's regexpr - a regular expression library in the style of Perl's and Ruby's regular expressions<br />
* Don Stewarts's pcre-light - A small, efficient and portable regex library for Perl 5 compatible regular expressions<br />
* Martin Sulzmann's regexpr-symbolic - Equality, containment, intersection among regular expressions via symbolic manipulation<br />
* Matt Morrow's regexqq - A quasiquoter for PCRE regexes<br />
* Uwe Schmidt's hxt-regex-xmlschema - supports full W3C XML Schema regular expressions inclusive all Unicode character sets and blocks. A tutorial is available at [[Regular expressions for XML Schema]].<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|+ Feature Matrix of Backends<br />
! Backend<br />
! Grouping?<br />
! POSIX/Perl<br />
! Speed<br />
! Native Impl?<br />
! Stable?<br />
! Lazy?<br />
! Comments<br />
|-<br />
| regex-posix<br />
| Yes<br />
| POSIX<br />
| very slow<br />
| No<br />
| Yes<br />
| No<br />
|<br />
|-<br />
| regex-parsec<br />
| Yes<br />
| POSIX,Perl<br />
| slow<br />
| Yes<br />
| Yes<br />
| ?<br />
|<br />
|-<br />
| regex-tre<br />
| Yes<br />
| POSIX<br />
| fast<br />
| No<br />
| No<br />
| ?<br />
| uses buggy libtre (v0.7.5)<br />
|-<br />
| regex-tdfa<br />
| Yes<br />
| POSIX<br />
| fast<br />
| Yes<br />
| Yes<br />
| Yes<br />
| full Posix compliance<br />
|-<br />
| regex-pcre<br />
| Yes<br />
| Perl<br />
| fast<br />
| No<br />
| Yes<br />
| ?<br />
|<br />
|-<br />
| regex-pcre-builtin<br />
| Yes<br />
| Perl<br />
| fast<br />
| No<br />
| Yes <br />
| ?<br />
|<br />
|-<br />
| regex-dfa<br />
| No<br />
| POSIX<br />
| fast<br />
| Yes<br />
| Yes<br />
| ?<br />
| <br />
|-<br />
| regexpr<br />
| Yes<br />
| Perl<br />
| ?<br />
| Yes<br />
| Yes<br />
| ?<br />
| easier for newcomers from other languages; 0.5.1 leaks memory in some cases<br />
|}<br />
<br />
'''Note:''' speed is something that should be benchmarked by the actual user, since<br />
the story changes so much with the task, new GHC versions, compiler flags, etc.<br />
The algorithm used may be a useful consideration (backtracking vs. NFA/DFA).<br />
<br />
All support String, (Seq Char), ByteString, and (except for regex-posix) ByteString.Lazy.<br />
<br />
All are available from [http://hackage.haskell.org/packages/archive/pkg-list.html#cat:Text Hackage] as tar.gz sources and from [http://darcs.haskell.org/packages/regex-unstable/ darcs].<br />
<br />
=== regex-base ===<br />
<br />
This package exports Text.Regex.Base which re-exports Text.Regex.RegexLike and Text.Regex.Context. These do not provide the ability to do matching, but provide the type classes which all the other regex-* backends use.<br />
<br />
The backend packages also import the utility module Text.Regex.Impl to streamline instance declarations.<br />
<br />
The 0.71 version has a "tail" bug in one of the instances of RegexLike:<br />
<haskell><br />
instance (RegexLike a b) => RegexContext a b (MatchResult b) where <br />
</haskell><br />
which I hope will be fixed in GHC 6.6.1. Getting the unstable version of regex-base also fixes this, though you will have to get and re-compile all the other regex-* modules as well.<br />
<br />
The versions of the regex-* backends that come with GHC 6.6 do not re-export the RegexLike classes, so the usage of regex-BACKEND is<br />
<haskell><br />
import Text.Regex.Base<br />
import Text.Regex.BACKEND<br />
</haskell><br />
The versions in unstable are being upgraded to re-export RegexLike, so the usage will be simplified to<br />
<haskell><br />
import Text.Regex.BACKEND<br />
</haskell><br />
<br />
The 0.71 version of regex-base only has Extract instances for [Char] and ByteString. The unstable version also provides instances of ByteString.Lazy and (Seq Char). This Extract support must be accompanied by adding support to each regex-* backend in unstable, and that work is in progress.<br />
<br />
The RegexMaker class in v0.71 had no way to gracefully report errors in parsing the regular expression itself. This was a design mistake and has been extended in the unstable version of regex-base to provide a monadic version which can fail gracefully.<br />
<br />
The RegexLike class only provides support for positions in the source text indexed by Int. I am still considering how to provide Int64 support. The best way is probably going to be another class, which will allow for more generalized index type support. Different backends will, by necessity, have different instances for extended index types.<br />
<br />
=== regex-tdfa ===<br />
<br />
Chris Kuklewicz has just released <code>regex-tdfa</code> (Tagged Deterministic Finite Automaton), a new library that works with GHC (the most recent being ghc-6.10.1). It is POSIX compliant and tested against<br />
[http://www2.research.att.com/~astopen/testregex/testregex.html the AT&T tests].<br />
<br />
This is available on hackage at [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-tdfa regex-tdfa] and via [http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/ darcs].<br />
<br />
The [http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/doc/html/regex-tdfa/Text-Regex-TDFA.html haddock documentation] is also on the darcs site.<br />
<br />
This uses a tagged DFA like the TRE c-library to provide efficient Posix matching. It also defaults to true Posix submatch capture (including ambiguous *-operator subpatterns), but this extra effort can be disabled.<br />
<br />
The versions from 0.90 and up use mutable ST arrays to keep track of data during matching and thus have both decent speed and decent memory performance. Previous versions drove memory use too high, overworking the garbage collector.<br />
<br />
The versions from 1.0.0 and up improve the algorithm. The search time is now O(N) for text of length N in the worst case, while still providing correct POSIX capturing and while running in bounded space.<br />
<br />
By disabling submatch capture (see the <code>captureGroups</code> field of <code>ExecOption</code>) this library avoids the extra work and should run faster (the "non capture" case; this is also used if there are no parentheses in the regular expression). By running in single-line mode (see <code>CompOption</code>) and with a leading ^ anchor this library also avoids extra work and should run faster (the "front anchored" case). Doing both optimizations should make it faster still.<br />
<br />
Just testing for a match stops at the shortest found match and should be fast (using matchTest or match/matchM for a Bool output), and this also tries to optimize for the "front anchored" case.<br />
<br />
The major advantage over pcre is avoidance of exponential blowup for certain patterns: asymptotically, the time required to match a pattern against a string is always linear in length of the string. This O(N) scaling is [http://article.gmane.org/gmane.comp.lang.haskell.cafe/53943 now achieved] even in the worst case and when returning the correct Posix captures.<br />
<br />
As of version 1.1.1 the following GNU extensions are recognized, all anchors:<br />
<br />
<pre><br />
\` at beginning of entire text<br />
<br />
\' at end of entire text<br />
<br />
\< at beginning of word<br />
<br />
\> at end of word<br />
<br />
\b at either beginning or end of word<br />
<br />
\B at neither beginning nor end of word <br />
</pre><br />
<br />
The above are controlled by the 'newSyntax' Bool in 'CompOption'.<br />
<br />
=== regex-posix ===<br />
<br />
See [[Regex Posix]] for bug reports relating to your operating system.<br />
<br />
This backend provides a Haskell interface for the "posix" c-library that comes with most operating systems, and is provided by '''include "regex.h"'''. This c-library probably also drives command line utilities such as sed and grep.<br />
<br />
"Posix" is in quotes since it is often not fully Posix compliant and may be buggy (as on OS X, where the bug also affects sed).<br />
<br />
And the c-library has what I call impossibly-slow performance, as in at least 100x slower than other regex engines.<br />
<br />
The goal of regex-tdfa is to create a replacement for regex-posix to accompany a future version of GHC.<br />
<br />
=== regex-compat ===<br />
<br />
This takes regex-posix and presents a Text.Regex api that mirrors the one that came with GHC 6.4.x for compatibility.<br />
<br />
=== regex-pcre ===<br />
<br />
This wraps the pcre c-library from http://www.pcre.org and gives all the Perl regular expression syntax you might want. This is especially efficient with Data.ByteString.<br />
<br />
=== regex-pcre-builtin ===<br />
<br />
This is the same as regex-pcre, but comes bundled with a version of the pcre C library.<br />
<br />
=== pcre-light ===<br />
<br />
Another FFI binding to PCRE; Don Stewart's pcre-light is intended to be "A light regular expression library, using Perl 5 compatible regexes", with support for Strings and strict ByteStrings. It is available on [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/pcre-light Hackage], or through a [http://code.haskell.org/~dons/code/pcre-light/ Darcs repo]; see also the [http://www.haskell.org/pipermail/haskell/2008-January/020120.html original announcement].<br />
<br />
=== regex-tre ===<br />
<br />
This wraps the TRE c-library from http://laurikari.net/tre/ which provides Posix regular expressions. The current 0.7.5 version of TRE is buggy, however, so you should avoid this library unless you test your regular expressions to ensure you avoid the bugs.<br />
<br />
The author of TRE is currently working to fix these bugs.<br />
<br />
Also, the Posix compliance in 0.7.5 fails for *-operators with ambiguous captures.<br />
<br />
=== regex-dfa ===<br />
<br />
This is the only LGPL backend, as it is derived from [http://www.cse.unsw.edu.au/~chak/haskell/ctk/ CTKLight]. The stable version has had a bad bug on *-operators around patterns that could accept zero characters. The unstable version will have this issue fixed.<br />
<br />
This library provides no submatch captures, but is very fast at finding the Posix leftmost longest match.<br />
<br />
=== regex-parsec ===<br />
<br />
This backend can either find the left-biased match like Perl or the longest match like Posix. It uses Parsec as a backtracking matching engine and is quite slow.<br />
<br />
The submatches returned in longest match mode maximize the length of the captured texts instead of the subpatterns, and this is a divergence from the Posix specification.<br />
<br />
==Documentation==<br />
<br />
Coming soonish. There is also a great tutorial for using (=~) and (=~~) at [http://www.serpentine.com/blog/2007/02/27/a-haskell-regular-expression-tutorial/ this blog post.]<br />
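<br />
For a small taste (assuming the regex-tdfa package is installed; <hask>Text.Regex.TDFA</hask> exports <hask>(=~)</hask>), note how the requested result type selects what is returned:<br />
<haskell><br />
import Text.Regex.TDFA ((=~))<br />
<br />
main :: IO ()<br />
main = do<br />
    print ("bar=456" =~ "[0-9]+" :: Bool)    -- is there a match at all?<br />
    print ("bar=456" =~ "[0-9]+" :: String)  -- text of the first match<br />
</haskell><br />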
<br />
A commentary on regex-posix bugs has been started at [[Regex_Posix]] in support of the regex-posix-unittest package.<br />
<br />
A commentary on the design and internals of such Posix engines has been started at [[Regex_TDFA]] mainly describing the regex-tdfa package.<br />
<br />
==Link to article on DFA==<br />
There is an article by Russ Cox [http://swtch.com/~rsc/regexp/regexp1.html on Thompson NFAs (nondeterministic finite automata)] which presents the DFA algorithm and shows it to be faster than backtracking (the method used in Perl, Ruby, and Python).<br />
<br />
The original version of the above article is slightly incomplete advocacy. I will explain in the (apples|oranges) section below [[User:ChrisKuklewicz|ChrisKuklewicz]] 15:56, 30 January 2007 (UTC)<br />
<br />
== (apple|orange) ==<br />
<br />
As Haskell is all about making data more strongly typed, I want to make the point that there are two popular Types of regular expressions in existence. Just as all Strings are not the same Type (some may be escaped or encoded versions, or hold data printed in different formats), a regular expression like "a?a" means two different things depending on its actual Type.<br />
<br />
And this difference is very close to another thing that Haskell and functional programming emphasize : declarative programming (what) as opposed to imperative programming (how).<br />
<br />
I will call the two Types of regular expressions ''Posix'' and ''Perl''.<br />
<br />
;Posix Regular Expressions<br />
:This is the declarative approach to regular expressions. The correct match of a regexp to a string of text starts at the leftmost possible position; there are no valid matches that can start before it. Of the possible matches that start at this leftmost position, the one that matches the longest substring is the correct match.<br />
:How this match is found is immaterial to this definition of the correct match.<br />
:Here and for the rest of this page I mean Posix to be the modern "Posix Extended Regular Expressions" and never the older "Posix Basic Regular Expressions".<br />
<br />
;Perl Regular Expressions<br />
:This is the imperative approach to regular expressions. The correct match of a regexp to a string of text starts at the leftmost possible position; there are no valid matches that can start before it. To choose the correct match that starts at this position, match parts of the regexp against the text until the first match is found. Specifically, you must try left branches of '|' before right branches, and treat '?', '*' and '+' as greedy. Greedy means matching as many iterations as possible, backing off the number of repetitions only if no complete match is possible.<br />
:The first match found may not be the longest, and it may not be the shortest. It is the left-biased choice.<br />
:This definition of a correct match is identical to a description of how a backtracking implementation would operate.<br />
<br />
To find a Perl match you usually do what Perl (and Python, and Java, etc.) do, which is to try the left branches and greedy repetitions first, then backtrack until the first answer is found. The number of paths is the product of the alternatives. So "a?" repeated 'n' times has 2<sup>n</sup> paths.<br />
<br />
To find a Posix match you must try all possible matches from a starting point and pick the longest, moving on to the next starting point if there was no match. If you must backtrack and check each path separately this will take exponential time. To avoid this you construct an automaton (NFA or DFA), and then you can check "a?" repeated n times in polynomial time (I think it is O(n<sup>2</sup>) for an NFA, O(n) for a DFA). You don't have to use an automaton; you can write a (slow) Posix engine using backtracking.<br />
<br />
<br />
Perl has an obvious and easy to implement "right-bias" variant that is the mirror image. And Posix has an obvious and easy to implement "shortest match".<br />
<br />
In Posix, declaring you want to match "a|b" is always exactly the same as "b|a", whereas in Perl these can be very different instructions.<br />
<br />
When writing a Perl regexp you may want more imperative control. So there are ''lazy'' variants '??' and '+?' and '*?' which are non-greedy and try the fewest number of repetitions first. And at least Java adds ''possessive'' variants '?+' and '++' and '*+' that try the largest number of iterations that locally match, but will not backtrack to fewer repetitions, thus removing many possible paths from the search space. And Perl introduced ''lookahead'' assertions, where the engine checks whether the text ahead of a point matches, or fails to match, a nested regular expression. After a while these extensions comprise a whole imperative parsing language expressed in the form of a single complicated string. So to make these easier to read, Perl introduced comments and a whitespace-ignoring mode that let you write better-documented regular expressions. And it lets you (especially in Perl 6) embed fragments of Perl into the regular expressions. So Perl regexps give you a tremendous amount of power in how to find the match, which translates into power (for the expert) in defining what the match will be.<br />
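<br />
The greedy/lazy distinction is easy to observe directly; a minimal sketch using pcre-light (assuming that package is installed):<br />
<haskell><br />
{-# LANGUAGE OverloadedStrings #-}<br />
import Text.Regex.PCRE.Light (compile, match)<br />
<br />
main :: IO ()<br />
main = do<br />
    -- greedy '+' grabs as many repetitions as possible<br />
    print (match (compile "a+" []) "aaa" [])<br />
    -- lazy '+?' settles for as few as possible<br />
    print (match (compile "a+?" []) "aaa" [])<br />
</haskell><br />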
<br />
With Perl there is a unique first match, so knowing what the parenthesized subgroups matched is easy. In Posix there can be several ambiguous ways to match the same longest answer. Two examples:<br />
* "12" =~ "(..)|(.)(.)" could match<br />
** \0 = "12", \1 = "12", \1 = no match, \2 = no match<br />
** \0 = "12", \1 = no match, \1="1", \2 = "2"<br />
* "12" =~ "(.?)*" could match<br />
** \0 = "12", \1 = "" (empty match)<br />
** \0 = "12", \1 = "2"<br />
And the Posix standard requires a left bias in the first example and the non-empty final iteration in the second example. So "a|b" and "b|a" are distinguishable if there are ambiguous ways to match and these ambiguities affect the captured subgroups.<br />
<br />
About the paper in the above section: It never mentions the difference between longest versus leftmost meaning of regular expressions. Replacing the Perl engine with a Posix DFA will simply break many carefully crafted regular expressions. I do not know if the engine Russ Cox touts that was written 20 years ago by Pike implements Posix or Perl semantics. If it finds the longest match (Posix) then it is no help in replacing the default engine in a language like Perl. If it does efficiently find the left-biased match then it would be possible. The comparison to Ville Laurikari's work is not encouraging in this regard since Ville's system finds the longest match and not the left-biased one.<br />
<br />
== Bounded space Posix proposal ==<br />
<br />
For further discussions, Chris has posted his [[/Bounded space proposal|bounded space proposal]] for Posix algorithms. This is also a part of the discussion at a thread on [http://lambda-the-ultimate.org/node/2064 Lambda The Ultimate].<br />
<br />
== Sample benchmark ==<br />
The benchmark consists of matching "Hello world foo=123 whereas bar=456 Goodbye" against ".*foo=([0-9]+).*bar=([0-9]+).*" (as a lazy bytestring) one million times (along the lines of <code>length . filter matchesPat . replicate 1000000</code>).<br />
It has been performed on a 2x2GHz P4 machine, compiled with -O2, ghc 6.10.1.<br />
<br />
regex-pcre: 0.02s<br />
<br />
pcre-light: 0.02s<br />
<br />
regex-tdfa 1.0.0 (no compiler bug): 0.03s<br />
<br />
regex-dfa 0.91: 5.4s<br />
<br />
regex-posix: 20s<br />
<br />
regex-tdfa 1.0.0 (most probably compiler bug): 89s<br />
<br />
In all cases, the pattern is compiled only once.<br />
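<br />
For illustration only, the loop described above might look like this with regex-tdfa, compiling the pattern once via <hask>makeRegex</hask> (the name <hask>matchesPat</hask> and the use of String instead of a lazy ByteString are simplifications, not the original benchmark code):<br />
<haskell><br />
import Text.Regex.TDFA<br />
<br />
main :: IO ()<br />
main = do<br />
    let regex :: Regex<br />
        regex = makeRegex (".*foo=([0-9]+).*bar=([0-9]+).*" :: String)<br />
        matchesPat :: String -> Bool<br />
        matchesPat = matchTest regex  -- the compiled pattern is shared<br />
        input = "Hello world foo=123 whereas bar=456 Goodbye"<br />
    print (length (filter matchesPat (replicate 1000000 input)))<br />
</haskell><br />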
<br />
WARNING: This result of 89s may be the result of a compiler bug that may cause the pattern to be compiled at every match invocation; without that bug, regex-tdfa performs extremely well. Please see ticket http://hackage.haskell.org/trac/ghc/ticket/3059 <br />
<br />
UNWARNING: The code for regex-tdfa 1.0.0 is being improved. In particular the slow 89s is probably real. But it need not be so. The upcoming version runs in less than 0.2 seconds.<br />
<br />
Thus, for patterns as simple as this one, it is appropriate to use pcre.<br />
For patterns where the Perl-style strategy of backtracking causes exponential backtracking blowup, it is appropriate to use regex-dfa or regex-tdfa.</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=Implement_a_chat_server&diff=59115Implement a chat server2014-11-23T02:10:57Z<p>Artyom Kazak: typos</p>
<hr />
<div>[[Category:Tutorials]]<br />
[[Category:Code]]<br />
== Introduction ==<br />
This page describes how to implement a simple chat server. The server should support multiple connected users. Messages sent to the server are broadcast to all currently connected users.<br />
<br />
== Trivial server ==<br />
We start with a trivial server.<br />
<haskell><br />
import Network.Socket<br />
<br />
main :: IO ()<br />
main = do<br />
    -- create socket<br />
    sock <- socket AF_INET Stream 0<br />
    -- make socket immediately reusable - eases debugging.<br />
    setSocketOption sock ReuseAddr 1<br />
    -- listen on TCP port 4242<br />
    bindSocket sock (SockAddrInet 4242 iNADDR_ANY)<br />
    -- allow a maximum of 2 outstanding connections<br />
    listen sock 2<br />
    mainLoop sock<br />
<br />
mainLoop :: Socket -> IO ()<br />
mainLoop sock = do<br />
    -- accept one connection and handle it<br />
    conn <- accept sock<br />
    runConn conn<br />
    mainLoop sock<br />
<br />
runConn :: (Socket, SockAddr) -> IO ()<br />
runConn (sock, _) = do<br />
    send sock "Hi!\n"<br />
    sClose sock<br />
</haskell><br />
<br />
This server creates a socket for listening on port 4242, and sends a single<br />
line to everyone who connects.<br />
<br />
== Using System.IO for sockets ==<br />
<hask>System.IO</hask> functions for input and output are much more convenient than those that <hask>Network.Socket</hask> provides. We can turn a <hask>Socket</hask> into a <hask>Handle</hask> as follows:<br />
<br />
<haskell><br />
import System.IO<br />
[...]<br />
runConn (sock, _) = do<br />
    hdl <- socketToHandle sock ReadWriteMode<br />
    hSetBuffering hdl NoBuffering<br />
    hPutStrLn hdl "Hi!"<br />
    hClose hdl<br />
</haskell><br />
<br />
== Concurrency ==<br />
So far the server can only handle one connection at a time. This is ok for just writing a message but won't work for a chat server. We can fix this quite easily though, using <hask>forkIO</hask>:<br />
<br />
<haskell><br />
import Control.Concurrent<br />
[...]<br />
mainLoop sock = do<br />
    conn <- accept sock<br />
    forkIO (runConn conn)<br />
    mainLoop sock<br />
</haskell><br />
<br />
== Adding communication between threads ==<br />
This seems to be a hard problem. Luckily, the <hask>Control.Concurrent.Chan</hask> module provides exactly what we need: channels with a single write and multiple read ends. First we decide on a message type. Let's use a string for now:<br />
<br />
<haskell><br />
type Msg = String<br />
</haskell><br />
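<br />
The broadcast behaviour of <hask>dupChan</hask> can be seen in isolation (a minimal sketch, independent of the server code): the duplicate starts empty, but anything written afterwards is readable from both ends.<br />
<haskell><br />
import Control.Concurrent.Chan<br />
<br />
main :: IO ()<br />
main = do<br />
    chan  <- newChan<br />
    chan' <- dupChan chan<br />
    writeChan chan "hello"<br />
    a <- readChan chan   -- the original read end sees the message<br />
    b <- readChan chan'  -- ...and so does the duplicate<br />
    putStrLn (a ++ " / " ++ b)  -- prints: hello / hello<br />
</haskell><br />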
<br />
<hask>main</hask> will have to create a channel, and pass it to <hask>mainLoop</hask>.<br />
<br />
<haskell><br />
import Control.Concurrent.Chan<br />
[...]<br />
main = do<br />
    [...]<br />
    chan <- newChan<br />
    mainLoop sock chan<br />
</haskell><br />
<br />
<hask>mainLoop</hask> in turn will pass it to <hask>runConn</hask>.<br />
<br />
<haskell><br />
mainLoop :: Socket -> Chan Msg -> IO ()<br />
mainLoop sock chan = do<br />
    conn <- accept sock<br />
    forkIO (runConn conn chan)<br />
    mainLoop sock chan<br />
</haskell><br />
<br />
And finally, <hask>runConn</hask> will duplicate the channel and read from it.<br />
<br />
<haskell><br />
import Control.Monad<br />
import Control.Monad.Fix (fix)<br />
[...]<br />
runConn :: (Socket, SockAddr) -> Chan Msg -> IO ()<br />
runConn (sock, _) chan = do<br />
    let broadcast msg = writeChan chan msg<br />
    hdl <- socketToHandle sock ReadWriteMode<br />
    hSetBuffering hdl NoBuffering<br />
    chan' <- dupChan chan<br />
    -- fork off a thread that reads from the duplicated channel<br />
    forkIO $ fix $ \loop -> do<br />
        line <- readChan chan'<br />
        hPutStrLn hdl line<br />
        loop<br />
    -- read lines from the socket and broadcast them on the channel<br />
    fix $ \loop -> do<br />
        line <- liftM init (hGetLine hdl)<br />
        broadcast line<br />
        loop<br />
</haskell><br />
<br />
Note that <hask>runConn</hask> now actually forks another worker thread for sending messages to the connected user.<br />
<br />
== Cleanups and final code ==<br />
[[Image:chat_server_screenshot.png|thumb|Screenshot :)]]<br />
There are two major problems left in the code. First, the code has a [[memory leak]], because the original channel is never read by anyone. This can be fixed by adding another thread just for that purpose.<br />
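The leak can be seen in isolation. Here is a minimal, self-contained sketch (separate from the server code) of the relevant <hask>Chan</hask> behaviour: writes made after <hask>dupChan</hask> are delivered to the original and to every duplicate, so a channel nobody reads from simply accumulates messages.<br />

```haskell
import Control.Concurrent.Chan

-- Minimal demonstration of dupChan semantics: a message written to the
-- original channel is received on the duplicate, but the copy queued in
-- the original channel stays there until someone reads it.
main :: IO ()
main = do
  chan  <- newChan
  chan' <- dupChan chan
  writeChan chan "hello"
  msg <- readChan chan'   -- the duplicate receives the message
  putStrLn msg            -- prints "hello"
```

In the server, a thread that does nothing but <hask>readChan</hask> on the original channel in a loop plays exactly this draining role.<br />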
<br />
Secondly, closing connections is not handled gracefully at all. This requires exception handling.<br />
<br />
The code below fixes the first issue and mostly fixes the second one, and adds a few cosmetic improvements:<br />
<br />
* messages are not echoed back to the user they came from.<br />
* every connection is associated with a name.<br />
<br />
<haskell><br />
-- with apologies for the lack of comments :)<br />
<br />
import Network.Socket<br />
import System.IO<br />
import Control.Exception<br />
import Control.Concurrent<br />
import Control.Concurrent.Chan<br />
import Control.Monad<br />
import Control.Monad.Fix (fix)<br />
<br />
type Msg = (Int, String)<br />
<br />
main :: IO ()<br />
main = do<br />
    chan <- newChan<br />
    sock <- socket AF_INET Stream 0<br />
    setSocketOption sock ReuseAddr 1<br />
    bindSocket sock (SockAddrInet 4242 iNADDR_ANY)<br />
    listen sock 2<br />
    -- drain the original channel so broadcast messages are not retained<br />
    forkIO $ fix $ \loop -> do<br />
        (_, msg) <- readChan chan<br />
        loop<br />
    mainLoop sock chan 0<br />
<br />
mainLoop :: Socket -> Chan Msg -> Int -> IO ()<br />
mainLoop sock chan nr = do<br />
    conn <- accept sock<br />
    forkIO (runConn conn chan nr)<br />
    mainLoop sock chan $! nr + 1<br />
<br />
runConn :: (Socket, SockAddr) -> Chan Msg -> Int -> IO ()<br />
runConn (sock, _) chan nr = do<br />
    let broadcast msg = writeChan chan (nr, msg)<br />
    hdl <- socketToHandle sock ReadWriteMode<br />
    hSetBuffering hdl NoBuffering<br />
    hPutStrLn hdl "Hi, what's your name?"<br />
    name <- liftM init (hGetLine hdl)<br />
    broadcast ("--> " ++ name ++ " entered.")<br />
    hPutStrLn hdl ("Welcome, " ++ name ++ "!")<br />
    chan' <- dupChan chan<br />
    -- fork off a thread that forwards broadcasts from other clients<br />
    reader <- forkIO $ fix $ \loop -> do<br />
        (nr', line) <- readChan chan'<br />
        when (nr /= nr') $ hPutStrLn hdl line<br />
        loop<br />
    -- read lines from the socket until "quit" or an exception<br />
    handle (\(SomeException _) -> return ()) $ fix $ \loop -> do<br />
        line <- liftM init (hGetLine hdl)<br />
        case line of<br />
            "quit" -> hPutStrLn hdl "Bye!"<br />
            _ -> do<br />
                broadcast (name ++ ": " ++ line)<br />
                loop<br />
    killThread reader<br />
    broadcast ("<-- " ++ name ++ " left.")<br />
    hClose hdl<br />
</haskell><br />
<br />
Have fun chatting!</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=Functor&diff=58993Functor2014-10-13T21:14:39Z<p>Artyom Kazak: fix broken link</p>
<hr />
<div>{{Standard class|Functor|module=Data.Functor|module-doc=Data-Functor|package=base}}<br />
The '''Functor''' class is defined like this:<br />
<br />
 class Functor f where<br />
     fmap :: (a -> b) -> f a -> f b<br />
<br />
It is available in the Prelude, but defined in Data.Functor.<br />
<br />
All instances of Functor should obey:<br />
<br />
 fmap id = id<br />
 fmap (p . q) = (fmap p) . (fmap q)<br />
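Both laws can be spot-checked on concrete functors such as lists and <hask>Maybe</hask>, for example in GHCi or a small program:<br />

```haskell
-- Spot-check of the functor laws on lists and Maybe.
main :: IO ()
main = do
  -- identity law: fmap id = id
  print (fmap id [1, 2, 3] == [1, 2, 3])
  -- composition law: fmap (p . q) = fmap p . fmap q
  print (fmap ((+1) . (*2)) (Just 3) == (fmap (+1) . fmap (*2)) (Just 3))
```

Both checks print <hask>True</hask>, but note that testing a few inputs does not establish the laws; instances must satisfy them for all inputs.<br />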
<br />
<br />
== More reading == <br />
<br />
* [http://www.haskellforall.com/2012/09/the-functor-design-pattern.html The functor design pattern]<br />
<br />
<br />
<br />
[[Category:Functor|*]]</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=The_JavaScript_Problem&diff=57952The JavaScript Problem2014-04-27T01:03:38Z<p>Artyom Kazak: superscripts and is→are</p>
<hr />
<div>== The problem ==<br />
<br />
The JavaScript problem is two-fold and can be described thus:<br />
<br />
# '''JavaScript sucks.''' The depths to which JavaScript sucks are well-documented and well-understood. Its main faults are: the lack of a module system; weak typing; verbose function syntax<sup>1</sup>; late binding<sup>2</sup>, which has led to the creation of various static analysis tools to alleviate this flaw<sup>3</sup>, but with limited success<sup>4</sup> (there is even a static type checker<sup>5</sup>); finicky equality and automatic conversion; the behaviour of <code>this</code>; and the lack of static types.<br />
<br />
# '''We need JavaScript.''' Using it for what it is good for, i.e. as a platform for browser development, while avoiding the language ''per se'', is therefore desirable, and many are working to achieve this in varying forms. Of the various ways to do it, we ought to opt for compiling an existing language, Haskell, to JavaScript: we do not have time to learn or teach people a new language, to garner a new library set, or to recreate the type checker and everything else that Haskell implementations already provide.<br />
<br />
== Mainstream alternatives ==<br />
<br />
=== Coffeescript ===<br />
It makes many aspects of JavaScript saner and more convenient, and compilation at least verifies the syntax; however, it still suffers greatly from weak typing.<br />
<br />
=== TypeScript ===<br />
<br />
Structural typing with traditional generics on top of JavaScript.<br />
Of all the alternatives, TypeScript's advantage is that it makes no changes to JavaScript: existing JavaScript code that passes jshint is valid TypeScript code. TypeScript also adds features from the latest JavaScript standards and can compile them down to older versions of JavaScript. TypeScript is by far the easiest JavaScript variant to learn. The downside is that one might desire a better language than just JavaScript with types.<br />
<br />
TypeScript defaults to dynamic typing when it cannot infer a type. However, it now has a noImplicitAny setting that reports a compilation error whenever a type cannot be inferred.<br />
<br />
Structural subtyping seems a good fit for JavaScript.<br />
Perhaps TypeScript's biggest problem is that null is a valid value of every type.<br />
<br />
== Haskell -> JS ==<br />
<br />
=== UHC ===<br />
<br />
Original blog post [https://github.com/atzedijkstra/javascript-runtime-for-UHC here.] Quickstart guide [http://chrisdone.com/posts/2012-01-06-uhc-javascript.html here.] A more in-depth discussion about the current capabilities of the backend [http://www.norm2782.com/improving-uhc-js-report.pdf here.] For an example of using the JavaScript compilation for a real app see this [http://alessandrovermeulen.me/2012/01/26/getting-rid-of-javascript-with-haskell/ blog post], there is also a port of wxAsteroids to the browser (see [http://uu-computerscience.github.io/js-asteroids/ github] or a [http://www.rubendegooijer.nl/posts/2013-04-06-haskell-oop.html blog post]). <br />
<br />
* Beta.<br />
* Only works for UHC, but promising. <br />
* UHC compiles enough of Hackage to be very useful.<br />
* Doesn't produce an explosion of code, seemingly.<br />
* Fairly substantial JS/DOM/W3C/HTML5 API.<br />
* Currently works.<br />
<br />
=== Fay ===<br />
<br />
Website: http://fay-lang.org/ Discussion on Reddit: [http://www.reddit.com/r/haskell/comments/11yrpi/fay_slides/ Fay slides]. The package is on [http://hackage.haskell.org/package/fay Hackage]. Fetch with Git: <br />
git clone git://github.com/faylang/fay.git<br />
<br />
* Compiles a subset of Haskell, needs more<br />
* Currently works.<br />
<br />
=== GHCJS ===<br />
<br />
The Github page is [https://github.com/ghcjs/ghcjs here.]<br />
<br />
* Alpha.<br />
* Works.<br />
* Incomplete.<br />
* Nicely designed.<br />
* Compiles most pure Haskell libraries no problem.<br />
* FFI to JS works, and the author, sviperll is a helpful guy.<br />
<br />
=== Haste ===<br />
<br />
[http://haste-lang.org Website], [http://hackage.haskell.org/package/haste-compiler Hackage]<br />
<br />
* Seamless, type-safe single program framework for client-server communication<br />
* Easy Javascript interoperability<br />
* Generates small, fast, minifiable code.<br />
* Lightweight concurrency, Cabal integration, FFI and GHC extensions supported.<br />
* Cross platform.<br />
* Works.<br />
<br />
=== [[JMacro]] ===<br />
<br />
On the Haskell wiki (see above) and on [http://hackage.haskell.org/package/jmacro hackage]<br />
<br />
* Mature, Maintained<br />
* Not Haskell, but an EDSL ''in'' Haskell nonetheless.<br />
* JMacro Panels provides a purely haskell combinator library that generates dynamically updating html and js with asynchronous client-server communication.<br />
* Syntax is a fusion of Haskell and JavaScript<br />
* Untyped, but with syntactic correctness (at least) enforced at compile-time.<br />
* Embeddable through quasi-quoting<br />
* Support for various forms of code-generation<br />
<br />
=== Others ===<br />
<br />
* [https://github.com/johang88/haskellinjavascript Haskell interpreter in JS] — An interpreter. Haven't tried but is apparently dead.<br />
* YHC JS backend — Beta-ish. Apparently works, but I was unable to compile YHC, so haven't tried yet. I would be interested in anyone's experience using it. There's [http://www.haskell.org/haskellwiki/Yhc/Javascript an old wiki page] about Yhc's JavaScript support, but Yhc itself is a dead project.<br />
* Emscripten — not Haskell→JS, but compiles LLVM/Clang output to JavaScript. Could possibly be used for GHC→LLVM→JS compilation; I tried this and it works, but for the result to actually run one would also have to compile the GHC runtime, which is not straightforward (to me).<br />
* hjscript — Beta. EDSL, not Haskell→JS. Works. Not ''very'' annoying to program in, but is JS semantics, not Haskell. Hackage package [http://hackage.haskell.org/package/HJScript here.]<br />
* Some have also tried writing a Haskell→JS compiler to make a more direct JS-aware translation of code (to not have huge code output a la GHCJS, YHC, Emscripten).<br />
<br />
<br />
== FP -> JS ==<br />
<br />
=== Ur/Web ===<br />
<br />
http://www.impredicative.com/ur/<br />
<br />
Perhaps the problem with Ur is that they are selling both a backend and a frontend together. Being a new language, the backend is lacking in libraries to be practical for many tasks. However, there is an RSS reader that is using Ur for the front-end and Haskell for the backend: https://bazqux.com/<br />
<br />
<br />
=== OCaml ===<br />
<br />
The OCaml -> JS compiler is supposed to be good; it is now used at Facebook for an internal in-browser code editor.<br />
http://ocsigen.org/js_of_ocaml/<br />
<br />
=== GorillaScript ===<br />
<br />
http://ckknight.github.io/gorillascript/<br />
<br />
Immutable by default, global type inference, macros: what CoffeeScript should have been. The syntax is similar to CoffeeScript's.<br />
<br />
=== Roy, PureScript ===<br />
<br />
[http://roy.brianmckenna.org/ Roy]: melds JavaScript semantics with functional languages. Experimental, but has many bread-and-butter Haskell features.<br />
Roy is written in JavaScript; PureScript is written in Haskell.<br />
<br />
=== Idris ===<br />
<br />
Idris is a compiled language with dependent types, implemented in Haskell, with backends for both LLVM and JavaScript. Experimental.<br />
<br />
* Full dependent types with dependent pattern matching<br />
* where clauses, with rule, simple case expressions, pattern-matching let and lambda bindings<br />
* Dependent records with projection and update<br />
* Type classes<br />
* Monad comprehensions<br />
* Syntactic conveniences for lists, tuples, dependent pairs, do notation, and idiom brackets<br />
* Indentation-significant syntax<br />
* Extensible syntax<br />
* Tactic-based theorem proving (influenced by Coq)<br />
* Cumulative universes<br />
* Totality checking<br />
* Simple foreign function interface (to C)<br />
* Hugs-style interactive environment<br />
<br />
Links:<br />
* [http://idris-lang.org/ Website idris-lang.org] <br />
* [[Dependent_type|Dependent Type in haskell wiki]]<br />
* [http://en.wikipedia.org/wiki/Dependent_type WP (en) Dependent type] (with Idris listed under language comparison)<br />
<br />
<br />
== Links ==<br />
<br />
* [https://github.com/yesodweb/yesod/wiki/Javascript-Options Yesod - Javascript Options]<br />
* [http://chrisdone.com/tags/javascript Chris Done Blog] - Tag: Javascript<br />
<br />
== Footnotes ==<br />
<br />
# Its support for closures is commonly noted as being one of JavaScript’s redeeming features.<br />
# Early binding allows for static verification of the existence of method-signature pairs (e.g. v-tables). Late binding does not give the compiler (or an IDE) enough information for existence verification, it has to be looked up at run-time.<br />
# There are several hinting libraries, which developers insist are indispensable tools when developing JavaScript seriously, such as JavaScript lint, JSLint, and JSure.<br />
# “Any non-trivial analysis is very difficult due to Javascript’s dynamic nature.” — Berke Durak, Ph.D., author of jsure.<br />
# Google Inc. thought it necessary to develop a compiler, Google Closure, which does type-checking and limited inference.<br />
<br />
<br />
[[Category:Web|*]]</div>Artyom Kazakhttps://wiki.haskell.org/index.php?title=Cookbook/Dates_And_Time&diff=57388Cookbook/Dates And Time2014-01-12T14:33:41Z<p>Artyom Kazak: </p>
<hr />
<div>== Finding today's date ==<br />
<br />
<haskell><br />
import Data.Time<br />
<br />
-- in GHCi:<br />
c <- getCurrentTime                      --> 2009-04-21 14:25:29.5585588 UTC<br />
let (y, m, d) = toGregorian (utctDay c)  --> (2009,4,21)<br />
</haskell><br />
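The date can also be rendered as a string with <hask>formatTime</hask> (a sketch; with time >= 1.5, <hask>defaultTimeLocale</hask> is re-exported by <hask>Data.Time</hask>, while older versions of the library import it from <hask>System.Locale</hask>):<br />

```haskell
import Data.Time

-- Format the current time with printf-style directives.
main :: IO ()
main = do
  c <- getCurrentTime
  putStrLn (formatTime defaultTimeLocale "%Y-%m-%d" c)  -- e.g. 2009-04-21
```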
<br />
== Adding to or subtracting from a date ==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Problem<br />
! Solution<br />
! Examples<br />
|-<br />
| adding days to a date<br />
| [http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Calendar.html#v%3AaddDays addDays]<br />
|<haskell><br />
import Data.Time<br />
a = fromGregorian 2009 12 31 --> 2009-12-31<br />
b = addDays 1 a --> 2010-01-01<br />
</haskell><br />
|-<br />
| subtracting days from a date<br />
| [http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Calendar.html#v%3AaddDays addDays]<br />
|<haskell><br />
import Data.Time<br />
a = fromGregorian 2009 12 31 --> 2009-12-31<br />
b = addDays (-7) a --> 2009-12-24<br />
</haskell><br />
|}<br />
<br />
== Difference of two dates ==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Problem<br />
! Solution<br />
! Examples<br />
|-<br />
| calculating the difference of two dates<br />
| [http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Calendar.html#v%3AdiffDays diffDays]<br />
|<haskell><br />
import Data.Time<br />
a = fromGregorian 2009 12 31 --> 2009-12-31<br />
b = fromGregorian 2010 12 31 --> 2010-12-31<br />
diffDays b a --> 365<br />
</haskell><br />
|}<br />
<br />
== CPU time ==<br />
Use [http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-CPUTime.html#v%3AgetCPUTime System.CPUTime.getCPUTime] to get the CPU time in picoseconds.<br />
<br />
You can time a computation like this (<hask>fib</hask> is defined here only as an example workload):<br />
<haskell><br />
import System.CPUTime (getCPUTime)<br />
<br />
getCPUTimeDouble :: IO Double<br />
getCPUTimeDouble = do<br />
    t <- getCPUTime<br />
    return (fromInteger t * 1e-12)<br />
<br />
-- a deliberately slow function to measure<br />
fib :: Int -> Integer<br />
fib n = if n < 2 then 1 else fib (n - 1) + fib (n - 2)<br />
<br />
main :: IO ()<br />
main = do<br />
    t1 <- getCPUTimeDouble<br />
    print (fib 30)<br />
    t2 <- getCPUTimeDouble<br />
    print (t2 - t1)<br />
</haskell></div>Artyom Kazak