75
$\begingroup$

I have two $M \times K$ arrays $L, T$ where I would like to set all the elements in $L$ to zero whenever the corresponding element of $T$ is greater than 15. The Select[] function is close but not quite what I need since it returns elements rather than indexes. In addition, even presuming I can get that list of indexes $i,j$ where $T_{ij}>15$ I do not know how to efficiently use them to set the corresponding elements $L_{ij} \rightarrow 0$.

What facilities, if any, does Mathematica have for this kind of advanced indexing? I come from the world of R, Matlab and Numpy where it is commonly used, as in

  L[T>15]=0
$\endgroup$
0

6 Answers 6

49
$\begingroup$

Using a little Mathematica pattern matching, I think you can get similar performance as @Szabolcs's answer while having nice Matlab-style syntax:

replaceWhere[cond_, selectTrue_, selectFalse_] := 
   With[{evaluatedCondition = evaluateTensorCondition[cond]},
        selectTrue*evaluatedCondition + selectFalse*(1 - evaluatedCondition)]

replaceWhere[cond_, selectTrue_] := replaceWhere[cond, selectTrue, 0]

evaluateTensorCondition[a_ > b_] := UnitStep[a - b]    

evaluateTensorCondition[a_ < b_] := UnitStep[b - a]

evaluateTensorCondition[a_ && b_] := 
    evaluateTensorCondition[a]*evaluateTensorCondition[b]

evaluateTensorCondition[a_ || b_] := 
    UnitStep[evaluateTensorCondition[a]+evaluateTensorCondition[b]-1]

(implementing more comparison and logical operators should be straightforward)

Usage:

replaceWhere[T > 15, L, 0]

returns a new list, where every index that is greater than 15 in T is set to 0 , the other elements are the same as L. And it should have similar performance as @Szabolcs's answer, because all the conditions are just transformed into vectorizable tensor operations. (I haven't timed it, though.)

Note that this works for arbitrarily complex expressions, and the last two parameters can be tensors or scalars:

replaceWhere[(T > 5 && T < 15) || Sin[L]<Cos[T], L, L*2]

If you evaluate this expression symbolically:

Clear[T, L]
replaceWhere[(T > 5 && T < 15) || Sin[L] < Cos[T], L, 0]

you get one long tensor expression composed of UnitStep's

Out = L UnitStep[-1 + UnitStep[15 - T] UnitStep[-5 + T] + UnitStep[Cos[T] - Sin[L]]]

So "under the hood" Mathematica will do the same operations as in @Szabolcs's answer. You just get a more readable syntax.


If you don't want to do all those fancy syntax manipulations, you can always use MapThread to combine higher dimensional arrays in any way you'd like. For your question, this would be:

MapThread[If[#1 > 15, 0, #2] &, {T, L}, 2]

This takes the 2d-Tensors (last argument 2) T and L (second argument {T, L}), passes the corresponding elements in the two tensors to a function (first argument If[#1 > 15, 0, #2] &) and returns a new tensor containing the return value of that function. This is slower than @Szabolic's answer, but it is also a lot more flexible.

$\endgroup$
8
  • $\begingroup$ You seem to be doing it properly, so you could add the third numerical argument like evaluateTensorCondition[cond]*(data - replaceWith) + replaceWith for the special case when it's not zero $\endgroup$
    – Rojo
    Commented Mar 11, 2012 at 23:08
  • 1
    $\begingroup$ I think it would be valuable if you edited back the original MapThread approach you posted. It was one of the reasons I voted for this post. While a complex custom function is interesting, I think it is important to mention MapThread here as an important way to combine higher dimensional arrays. Don't forget that the OP is new to Mathematica. $\endgroup$
    – Szabolcs
    Commented Mar 11, 2012 at 23:32
  • $\begingroup$ @Szabolcs: I thought the answer was getting too long... But I can add the MapThread version at the end, for completeness $\endgroup$ Commented Mar 11, 2012 at 23:35
  • $\begingroup$ Switched my acceptance to this answer, as it appears to be a very useful generalization. $\endgroup$
    – Brian B
    Commented Mar 12, 2012 at 18:08
  • $\begingroup$ @nikie: you are implicitly assuming that the elements of evaluatedCondition are either 0 or 1. This is not always the case (see the code for the || part). I guess the code should be modified slightly not to produce strange result in the case that the condition invokes an Or statement. $\endgroup$
    – Fabian
    Commented Mar 12, 2012 at 19:42
63
$\begingroup$

Update: This is now available as the BoolEval package.


I recommend using

L UnitStep[15 - T]

for good performance.


To answer your question about boolean indexing:

When you write L[T > 15] = 0 in MATLAB, T > 15 evaluates to a boolean matrix of 0s and 1s, which can be used as a special selector index in assignments, as you showed (I am writing this for those Mathematica users who don't know this feature of MATLAB).

This feature is not available directly in Mathematica.

However, it is important to know that Mathematica is usually most effective when using it a functional way, i.e. using operations that do not have side effects. Code using heavy assignments is typically not the best/fastest way to approach a problem.

There is a function, Pick, which is similar (but not identical) to this MATLAB feature. Pick[a, b] will return those elements of a for which the corresponding element of b is True. Pick[a, b, 1] allows using 1 instead of True. a and b can be arbitrary dimensional tensors or even ragged lists.

That said, there is no direct equivalent of L[T > 15] = 0. However, anything that can be accomplished with this syntax in MATLAB should be doable using arithmetic operations between tensors (as in my example). This will buy you the performance advantage of vectorization, but it'll still be a bit slower than direct assignment.


Update:

This is a set of functions to help working in this paradigm:

greatereq[a_, b_] := UnitStep[a - b]
lesseq[a_, b_] := UnitStep[b - a]
greater[a_, b_] := 1 - lesseq[a, b]
less[a_, b_] := 1 - greatereq[a, b]
unequal[a_, b_] := Unitize[a - b]
equal[a_, b_] := 1 - unequal[a, b]

equal[a_, b_, c__] := equal[a, b] equal[b, c]
less[a_, b_, c__] := less[a, b] less[b, c]
greater[a_, b_, c__] := greater[a, b] greater[b, c]
lesseq[a_, b_, c__] := lesseq[a, b] lesseq[b, c]
greatereq[a_, b_, c__] := greatereq[a, b] greatereq[b, c]

unequal[a__] := Times @@ (unequal @@@ Subsets[{a}, {2}])

ClearAll[booleval]
booleval[condition_] := 
 Unevaluated[condition] /. {arr_?ArrayQ :> arr, Less -> less, Greater -> greater, 
   LessEqual -> lesseq, Equal -> equal, Unequal -> unequal, 
   GreaterEqual -> greatereq, Or -> Composition[Unitize, Plus], 
   And -> Times, Not -> (1 - # &), True -> 1, False -> 0}

ClearAll[pick]
pick[array_, condition_] :=
 Pick[array,
  booleval[condition],
  1]

booleval will take a set of (in)equalities over arrays and will evaluate them to obtain an array of Boolean values (0 or 1) while making use of vectorization and keeping packed arrays packed. pick will extract the part of an array corresponding to 1s. This will work similarly to MATLAB.

Example:

a = Range[10];
pick[a, a > 5 || a^2 <= 10]

(* ==> {1, 2, 3, 6, 7, 8, 9, 10} *)

The advantage is the convenience of using standard notation for inequalities instead of a cryptic UnitStep.

The OP's example would be written as L booleval[T <= 15].

$\endgroup$
10
  • $\begingroup$ Have you considered using Subtract for performance in the package code? $\endgroup$
    – Mr.Wizard
    Commented Jul 11, 2016 at 8:17
  • $\begingroup$ @Mr.Wizard Good point, thank you! $\endgroup$
    – Szabolcs
    Commented Jul 11, 2016 at 8:47
  • $\begingroup$ By the way this may be relevant to you: (1922) $\endgroup$
    – Mr.Wizard
    Commented Jul 11, 2016 at 8:51
  • 1
    $\begingroup$ There are some condition where Select works better than UnitStep based solution. Notice that Select based solution can utilize the fact that True || a series of stuffs will always be True and False && a series of stuffs will always be False. so there's no need for evaluation of "a series of stuffs". While UnitStep based solution cannot easily use this fact. so when tests are significantly more than elements in list, performance can reverse. $\endgroup$
    – Wjx
    Commented Feb 17, 2017 at 9:10
  • 2
    $\begingroup$ @Wjx Vectorization is always fast, but the code is hard to read and write (in Mathematica, not e.g. in MATLAB). With this code, and later with BoolEval, I tried to make it easy to read/write as well. Sadly, some bugs prevent me from doing this. Try e.g. arr = RandomReal[1, 100]; On["Packing"], then evaluate arr > 0.5. It unpacks. Why does it do that? I don't know. It doesn't even evaluate! Maybe it's a bug. Last time I complained to WRI, the support person did not understand my point and kept saying that arr < 0.5 is not correct syntax. I know, but that's no excuse for unpacking. $\endgroup$
    – Szabolcs
    Commented Feb 17, 2017 at 9:21
41
$\begingroup$
L=ReplacePart[L, Position[T, i_/;i>15]->0]

Go with @Szabolcs answer whenever you can

$\endgroup$
2
  • 6
    $\begingroup$ This will work for more complex tests, where the UnitStep approach may not apply. $\endgroup$ Commented Mar 11, 2012 at 21:22
  • $\begingroup$ I like the UnitStep, ReplacePart solutions a lot. Some of the other solutions are very complicated and sure to turn off people who are new to Mathematica. .... I think I finally figured out how and when to 'Add Comment' in this forum :) $\endgroup$
    – Ted Ersek
    Commented May 13, 2012 at 0:32
20
$\begingroup$

Szabolcs's answer using UnitStep is the right one for your specific case. It is also the most efficient by far: see the timing results below. Explicit Map operations and size comparisons do seem to be particularly slow.

However there is a Boole function that replicates the Matlab approach of returning 1 and 0 instead of True and False, which might be useful for other specific cases.

The other two examples I show here are roundabout ways of getting the same answer that Szabolcs did using UnitStep (once you multiply by ll), but can be adapted to cases where the UnitStep version does not work for whatever reason.

In[2]:= ll = RandomInteger[{1, 100}, {20, 20}];
tt = RandomInteger[{1, 100}, {20, 20}];  


In[19]:= Do[Boole[Map[# < 15 &, tt, {2}]];, {1000}] // Timing

Out[19]= {0.375, Null}

In[20]:= Do[UnitStep[15 - tt];, {1000}] // Timing

Out[20]= {0.016, Null}

In[25]:= Do[Boole[Positive[15 - tt]];, {1000}] // Timing

Out[25]= {0.11, Null}

$\endgroup$
13
$\begingroup$

I'm actually writing a simple package similar to the Pandas package in Mathematica right now. I've not made it to be efficient at all, but maybe I should try putting it up on github if anyone is interested.

You can redefine the Part expression for a custom data structure. Here I have defined the Part function for a expression with the head pandasSeries.

pandasSeries /: 
 Part[ps_pandasSeries, 
  boolps_pandasSeries /; VectorQ[Data@boolps, BooleQ]] := 
 pandasSeries[Pick[Data@ps, Data@boolps], 
  If[Index@ps =!= Null, Pick[Index@ps, Data@boolps]], Name@ps]

Where Data and Index are just accessory functions for the data structure. Anyway once you redefine comparison operators on the data structure to thread over it, you can run expressions like:

ps1[[ps1 < 3]]

The real difficulty is defining what every function already written to work with lists should do for a new data structure. So far my solution has been to do something like:

pandasSeries /: 
  func_[fore___, ps_pandasSeries, after___] /; 
   Not@MatchQ[func, noItFuncs] :=

  With[{res = func[fore, Data@ps, after]},
   If[Head[res] === List, pandasSeries[res, Index@ps, Name@ps], res]];

The idea is simply to define that functions that work on lists can work on the new data structure by extracting its data and running on that data, then using the constructor for the data structure. It works okay for the most part, but I didn't write it for speed. It´s really supposed to be more a proof-of-concept.

$\endgroup$
2
  • $\begingroup$ Have you kept up with this project at all? I would be interested in seeing the github if it exists. $\endgroup$
    – Gabriel
    Commented Dec 7, 2012 at 21:51
  • 1
    $\begingroup$ Haven't kept up. I'm using the pandas library in python for a lot of this kind of stuff recently, so it doesn't matter. The library is completely useless except as an exercise, but here it is: github.com/searke/Pandas-Package $\endgroup$
    – Searke
    Commented Dec 9, 2012 at 22:56
10
$\begingroup$

Here's my solution based on the syntax of Rojo, it works for most syntaxes of Part (maybe all). All dimensions corresponding to the indices where the test function and All are present are considered in the data selection. The value argument can either have the dimension of the x list or of the list corresponding to non-integer positions.

Warning

The code works by adding an UpValue to Part for the value assignment case, and an UpValue to Function for retrieving values satisfying a condition. These are non local changes meant to be experimental, I haven't tested this function extensively, so please use with caution.

If you want to use the functions below in a safe way, see the answer to Alternative to overloading Set.

First some usage example

x=Table[2,{3},{2},{3},{2}];
x[[All,1,EvenQ[#] &,All]]=6
x

y=x[[All,1,All,All]]*2
x[[All,1,EvenQ[#] &,All]]=y
x

x[[1;;,1,EvenQ[#] &,All]]=6
x[[1 ;;, 1, All, All]]
x[[1;;,1,EvenQ[#] &,All]]

(*The parts of another symbol than x can be accessed using Sequence@@#2  
 as the condition is applied in a MapIndexed*)
x[[1;;,1,EvenQ[y[[Sequence @@ #2]]]&,All]]=6

The code

Unprotect[Part];
UpValues[Part]={};
Part/:Set[Part[x_,leftIndices:((_Integer|All|_List|_Span)...),cond_Function(*|cond_Symbol/;!(ListQ[cond]||cond===All)*),rightIndices:((_Integer|All|_List|_Span)...)],value_]:=
    part[x,{leftIndices},cond,{rightIndices},value];
Protect[Part];

Unprotect[Function];
UpValues[Function]={};
Function/:Part[x_,leftIndices:((_Integer|All|_List|_Span)...),cond_Function(*|cond_Symbol/;!(ListQ[cond]||cond===All)*),rightIndices:((_Integer|All|_List|_Span)...)]:=
    part[x,{leftIndices},cond,{rightIndices}];
Protect[Function];

SetAttributes[part,HoldFirst];
part[x_,{leftIndices:((_Integer|All|_List|_Span)...)},cond_Function(*|cond_Symbol/;!(ListQ[cond]||cond===All)*),{rightIndices:((_Integer|All|_List|_Span)...)},value_:None]:=
    Module[{xDimensions,indexSpanRange,allIndices,matchedSubPositions,matchedPositions,valuesExtracted,nonIntegerPositions, valuesReturned},

        allIndices={leftIndices,Sequence@@ConstantArray[All,Length@Dimensions@x-Length@{leftIndices}-Length@{rightIndices}],rightIndices};
        matchedSubPositions=Position[MapIndexed[cond,x[[Sequence@@allIndices]],{-1}],True,{-1}];

        nonIntegerPositions=Position[allIndices/.{_Span->All},All|_List] // Flatten;

        If[matchedSubPositions=!={}&&(Length[{leftIndices}]>0||Length[{rightIndices}]>0),
            matchedPositions=ConstantArray[allIndices /.{_Span->0,All->0},Length@matchedSubPositions];

            If[Position[allIndices,_List|_Span]=!={},
                xDimensions=Dimensions[x];

                MapIndexed[
                    (
                        Switch[allIndices[[#1]],
                            _List,
                                matchedSubPositions[[All,First@#2]]=matchedSubPositions[[All,First@#2]]/.Thread[(Range@Length@allIndices[[#1]])->allIndices[[#1]]];,
                            _Span,
                                indexSpanRange=Range@@(allIndices[[#1]]/.{All->xDimensions[[#1]],i_Integer?Negative:>xDimensions[[#1]]+i+1});
                                matchedSubPositions[[All,First@#2]]=matchedSubPositions[[All,First@#2]]/.Thread[(Range@Length@indexSpanRange)->indexSpanRange];
                        ]
                    )&
                    ,
                    nonIntegerPositions
                ];
            ];

            matchedPositions[[All,nonIntegerPositions]]=matchedSubPositions;
            ,
            matchedPositions=matchedSubPositions;
        ];

        (*assignment*)
        If[value =!= None && matchedPositions=!={},
            If[NumericQ[value],
                x=ReplacePart[x,matchedPositions->value];
                ,
                If[Length@Dimensions@value==Length@nonIntegerPositions,
                    valuesExtracted=Extract[value,matchedSubPositions];
                    ,
                    valuesExtracted=Extract[value,matchedPositions];
                ];

                x=ReplacePart[x,Thread[matchedPositions->valuesExtracted]];
            ];
        ];

        (*returned values*)
        If[matchedPositions=!={},
            valuesExtracted=Extract[x,matchedPositions];

            valuesReturned = 
                ReplacePart[
                    ConstantArray[{},xDimensions[[nonIntegerPositions]]]
                    ,
                    Thread[matchedSubPositions->valuesExtracted]
                ];

            valuesReturned
            ,
            {}
        ]
    ];
$\endgroup$
4
  • 3
    $\begingroup$ I only noticed this answer right now, but I'd stress again that modifications of built-in functions are the last resort and may have unintended consequences. The Part and Function are among those I would never touch for anything serious. Apart from the general problems which come with overloading built-ins, Part is particularly important. It is easy, for example, to inadvertently make assignments have different complexity (like linear instead of constant time), and break Part in other subtle ways. If nothing else, I'd seriously recommend to put a warning at the top in large bold font. $\endgroup$ Commented May 22, 2012 at 13:40
  • $\begingroup$ In fact, while I don't downvote, this answer is a close candidate for my downvote had I changed my mind. Please don't take it personally, you have many great posts, and I also did all these experiments with overloading built-ins, many times. It's just that there is a difference between our private experimentation and putting this stuff out in a public answer, because many people may get excited by this and not have enough knowledge to see the dangers of this approach (this does not refer to you, of course). $\endgroup$ Commented May 22, 2012 at 13:47
  • $\begingroup$ Yes I hesitated before putting this here, my previous answer didn't modify Part and Function, I left it to interested people. I put the code more for sharing the experimentation, in case someone could improve it. $\endgroup$
    – faysou
    Commented May 22, 2012 at 14:48
  • $\begingroup$ It is a pity that we don't have some place where we could share experimental stuff without all that responsibility, which would be as easily accessible / readable /etc as here. May be some day... $\endgroup$ Commented May 22, 2012 at 14:58

Not the answer you're looking for? Browse other questions tagged or ask your own question.