Difference between revisions of "PermissiveImportsProposal"

Revision as of 18:13, 23 October 2014

Description

In Haskell we are able to import multiple identifiers into the same namespace, but this can't generally be taken advantage within a module. If we have "import Data.Text; import Data.Text.Lazy" in our imports and attempt to use isInfixOf we will get an ambiguous occurrence. However, there is one case where the usage can be determined unambiguously: when an identifier has been imported explicitly by name. Take the following example:

import Data.Text.Lazy.IO (putStrLn) 
import Prelude 
import qualified Data.List as L 
import qualified Data.Text.Lazy as L (isInfixOf)

Throughout this proposal we will refer to the putStrLn/isInfixOf imports as being explicit, while the others are implicit. I would argue any uses of putStrLn and L.isInfixOf in the code following this import statement are unambiguous, and the compiler should treat them as such. There's no way to take advantage of the current behaviour; it just causes fewer programs to be accepted than are necessary, and given we've explicitly asked for the identifiers from specific modules there's no reason for the compiler to assume we might want them from the Prelude or Data.List.

Advantages

Currently using more polymorphic functions (e.g. IO functions lifted into MonadIO) or differently monomorphic functions (e.g. Text instead of String-based) requires a certain amount of friction. In these cases it is common to use the same identifiers for the same functionality, but this requires us to hide those same identifiers from any modules which export clashing names. When we've already already explicitly imported those identifiers in the first place these hiding clauses are redundent, and amount to code duplication. Plus, if in the course of refactoring we later decide we no longer need the explicitly imported functions we should probably remove the now-unnecessary hiding clause, which could easily be missed if the programmer isn't conscientious enough, and could cause confusion if they're left in.

The other option is to use re-designed modules that don't export the identifiers you don't want to use. This is common in the case of the Prelude, where there are many alternate Preludes with a different set of exported functions. Using these incurs an extra dependency, and doesn't solve the issue if you later decide you need more general functions again. There are legitimate reasons not to want the Prelude to be overly polymorphic, as it can complicate error messages. With this proposal libraries could be more conservative with how much polymorphism they include, and users could still easily take advantage of the more polymorphic functions without the hiding clauses when they need to, still having the less generalized functions when they don't need the extra power. This should reduce the need to expose too much polymorphism in the Prelude (however much that happens to be), and also make it easier for a larger number of people to experiment with changes that might be considered in the future to see how well those changes work in practice.

This would also have the advantage when implicitly importing from a module that is updated to export new functions. If you are already using functions with the same identifier this will cause an ambiguous occurrence error. I did read in a discussion about moving <$>/<*> into the Prelude that this was a concern, as it would essentially steal the symbol from other users <-- looking for link -->. Note that the PVP also allows minor version bumps to include new identifiers, though I don't think this is a common problem in practice.

Possible issues

The current behaviour encourages a sort of measure-twice-cut-once mentality. If we've explicitly imported a function by mistake and are expecting the implicitly imported version to be used we would normally get an ambiguous occurrence, whereas when using this we would either get a type error or no error if the mistakenly imported function is type correct. I'm not sure if this would be an issue in practice, as it requires you to be importing a function with the same name, same type, and different semantics, which I don't think is common. Likewise, it relies on having mistakes in your import list, which can already cause bugs even with the current behaviour. However, it is still more permissive than currently. Import statements would also be slighly less local: if I see "import Data.List" at the beginning of the import list and I miss the "import Data.Text (isInfixOf)" at the other end of the list I could mistakenly think uses of isInfixOf in the module use the list version, when they are really using the Text version. These issues might be more common in large projects with a large number of imports and contributors. At least with a pragma it's opt-in, and easy to change if a project later decides the redundancy is useful.

Implementation

I don't have enough knowledge of GHC's internals to implement this, so it's a feature request. However, there are two quite different ways to implement this that occur to me, so I may as well list them. One is automatically hiding imports that have been explicitly imported. So the example given above would become:

import Data.Text.Lazy.IO (putStrLn) 
import Prelude hiding (putStrLn, isInfixOf)
import qualified Data.List as L hiding (putStrLn, isInfixOf)
import qualified Data.Text.Lazy as L (isInfixOf) 

import qualified Prelude (putStrLn)
import qualified Data.List (isInfixOf)

We first do a pass through the import list to see what has been explicitly imported, then do another pass to add/append hiding clauses to anything bringing an unspecified number of identifiers into scope. Note it is not an error to hide an identifier that isn't being exported, so we aren't required to know which modules export what identifiers to do this. However, this would prevent us from calling Prelude.putStrLn explicitly, so we also need to add an explicit qualified import of any identifiers that have been hidden to not disallow those uses (unless the user had already manually hidden them). There's also the question of how to resolve

import Data.Either (Either(..))
import MyModule (Left)

That would suggest we should import all of Either's data constructors but Left, but there's no way to do something like "import Data.Either (Either(..)) hiding (Left)". I don't think there's any way to resolve that without knowing what Data.Either exports.

I think the better implementation is automatically promote anything that has been imported explicitly to a qualified use, provided only one identifier with the name has been explicitly imported. So with the example again:

import Data.Text.Lazy.IO (putStrLn) 
import Prelude 
import qualified Data.List as L 
import qualified Data.Text.Lazy as L (isInfixOf)

any unshadowed use of putStrLn is automatically changed to Data.Text.Lazy.IO.putStrLn, and any unshadowed use of L.isInfixOf is changed to Data.Text.Lazy.isInfixOf. If multiple identifiers with the name putStrLn had been explicitly imported no promotion would occur. This should also cover the Data.Either example, would still let you use the non-explicitly imported identifiers by qualifying their use, and would probably be simpler to implement in GHCi (no need to retroactively hide identifiers after an explicit import). The only tricky case is:

import Data.Text.Lazy.IO (putStrLn) 
import Prelude as Data.Text.Lazy.IO

In that case putStrLn should refer to the text version, but changing it to Data.Text.Lazy.IO.putStrLn if anything makes it more ambiguous. If we wanted to resolve this I'm sure GHC can disambiguate even if the user can't at the source level. Granted, I don't think anyone would lose sleep if this still gave an ambiguous identifier warning. Really just depends where in GHCs pipeline the change is implemented I guess.

Similar requests

There have been similar calls for a sort of unambiguous generalizing functions feature <-- looking for link-->: if a function and a more general version of that same function are both imported the more general version is used. This proposal would largely handle that use case, while posing a smaller engineering challenge. As long as the user has explicitly imported the more general version they get it, not because it's more general but because the user was explicit. This would also put less strain on library writers, who wouldn't be required to including rules specifying which functions generalize which.

This proposal wouldn't require changing any existing library code, nor would its implementation potentially require access to type information. I don't think it should require changing the behaviour of the actual ambiguous occurrence resolution stuff either, provided the transformation is applied before the ambiguous occurrence checks.

https://ghc.haskell.org/trac/ghc/ticket/9702 is also related.

@@ Line 19: / Line 19: @@
 The other option is to use re-designed modules that don't export the identifiers you don't want to use. This is common in the case of the Prelude, where there are many alternate Preludes with a different set of exported functions. Using these incurs an extra dependency, and doesn't solve the issue if you later decide you need more general functions again. There are legitimate reasons not to want the Prelude to be overly polymorphic, as it can complicate error messages. With this proposal libraries could be more conservative with how much polymorphism they include, and users could still easily take advantage of the more polymorphic functions without the hiding clauses when they need to, still having the less generalized functions when they don't need the extra power. This should reduce the need to expose too much polymorphism in the Prelude (however much that happens to be), and also make it easier for a larger number of people to experiment with changes that might be considered in the future to see how well those changes work in practice.
-This would also have the advantage when implicitly importing from a module that is updated to export new functions. If you are already using functions with the same identifier this will cause an ambiguous occurrence error. I did read a in discussion about moving <$>/<*> into the Prelude that this was a concern, as it would essentially steal the symbol from other users <-- looking for link -->. Note that the PVP also allows minor version bumps to include new identifiers, though I don't think this is a common problem in practice.
+This would also have the advantage when implicitly importing from a module that is updated to export new functions. If you are already using functions with the same identifier this will cause an ambiguous occurrence error. I did read in a discussion about moving <$>/<*> into the Prelude that this was a concern, as it would essentially steal the symbol from other users <-- looking for link -->. Note that the PVP also allows minor version bumps to include new identifiers, though I don't think this is a common problem in practice.
 ==Possible issues==