Episode #561

Aligning and Zooming Images

Series: macOS Mastodon Client

43 minutes
Published on August 3, 2023

When tapping on an image, we want to open a new window with that image, but have it expand from the position in the gallery on the post. To do so, we have to do some window manipulation using NSPanel and some frame calculations.

This episode uses Swift 5.7, Xcode 14.3.

Okay, welcome back.

We have our gallery view here working in the preview, and now what I think we could work

on is when you tap on the individual image that it opens up in a bigger window, and we

can have a transition so it comes from this part of the frame and sort of expands to its

big size.

And so if we run the live preview, every time this ends up sort of throwing it on the other

monitor and then we click on it and we're like, "Okay, that test is done and then I've

got to continue."

So we were chatting off-camera and thinking maybe it would just be easier for this episode

if we just use the image_gallery_previews.previews

as our main window.

And if we do that and we run it, we actually

will run a new instance of the app that

only has this code in it, or this UI.

So that's the entire app now, which is kind of amazing

if you think about old UIKit and AppKit ways of doing

this sort of thing. It's not hard, it would just be way more tedious than that.

Which is pretty amazing.

Also, because this is now the app's main window, when I close it and run it again, it should

remember where it was.

So I think we're pretty good to build and run to test this thing out quickly and then

we'll just uncomment it when we're done.


And a tip I actually have for people when doing that, that I'm going to give you as

well is when you are doing something like that where you replace the main

window so the main window content and you want to have it in at a particular

size and position on screen what you can do is run the app then you position it

and size it however you want and instead of like closing the window or stopping

the app in Xcode just do command Q to quit the app because then all of the

state restoration machinery is going to do its work and save that state and then

when you rerun it it will still be there but if you stop it in Xcode or if you

close the window and stop it it's not guaranteed that it will have had the

chance to save it. So that's another tip for the poor man's Xcode previews that we're doing.

Also I'm gonna be taking advantage of this clear all issues thing to clear

this phantom issue that just won't go away. I can't wait to be on the new

version of Xcode. Oh it still has that bug so... Oh joy.

Yeah. Amazing. All right. So because of that I think I can just turn off the

preview and give us some more space. So just as a quick recap, so we have this

for each item in the layout we are going to make an image item and when

you tap on it we're gonna show the gallery window manager which there's a

single one and we're gonna show the window with all of those attachments but

starting at the one that you tapped on, right? So do we need to somehow pass

the frame that we're coming from to this entry point, or how should we tackle that?

This is quite tricky because in order to have a transition where we start out

with a window and the image positioned so it's kind of a magic trick. We'll have

a window that starts out with the frame of its contents aligned with the

image on a completely separate window that we don't have access to because we

are in SwiftUI. So to do that in SwiftUI there is no built-in gesture that will

give us the frame in window coordinates of the thing that we clicked on so we'll

have to roll our own as you'd expect. By this point I think folks are already

expecting some AppKit to be involved. You can do something if you only need

the position you can use a drag gesture with a minimum distance of zero so then

when you do the on end you can get like the position but it doesn't give you the

actual frame so we'll basically create our own version of a tap gesture that

will leverage AppKit in order to get the actual coordinates and then later on

we'll have to convert those coordinates from the window coordinate space into

the screen coordinate space, which is a whole different thing.
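
The position-only fallback mentioned here could be sketched as follows. This is only an illustration: the view name `PositionOnlyTapExample` and the choice of the `.global` coordinate space are assumptions, not something shown in the episode. It reports where the click ended, but not the frame of the view that was clicked.

```swift
import SwiftUI

// Hypothetical example view: a zero-distance drag gesture behaves like a tap
// and reports a location, but never the clicked view's frame.
struct PositionOnlyTapExample: View {
    @State private var lastLocation: CGPoint?

    var body: some View {
        Rectangle()
            .fill(Color.blue)
            .frame(width: 120, height: 80)
            .gesture(
                DragGesture(minimumDistance: 0, coordinateSpace: .global)
                    .onEnded { value in
                        // Only a point is available here, not a rectangle.
                        lastLocation = value.location
                    }
            )
    }
}
```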

So, yeah, a lot of fun.

Alright, where should we start? So, let's create

a custom gesture, and the way we can do that,

you can't really create custom gestures, there is no mechanism for

that in SwiftUI. I don't know if this year they added something like that.

I think for animations they did, but not for gestures. So we'll have to create a

custom modifier for this. That's my preferred way to do it, at least.

So I'll just put it here for now. So we'll do

custom gesture modifier is a view modifier.

Right. Do you have a better name than custom gesture modifier?

The name I gave it originally was

"OnTap" - no actually that's the name of the extension - so the name of the modifier is

"TapGestureWithSourceRectModifier" that's the name I gave it. And for modifiers I actually like to

have modifier in the name just to make it clear that it's a modifier.

Yeah and you're always going to do an extension of view and add this right so we'll have a func

onTapWithSourceRect and this one will return modifier with tapGestureWithSourceRectModifier

right and this returns some view yeah and it will also take a block it'll take a block that is a

A view builder?

No, that's actually just a...

Oh no, it's...

It takes an NSRect and returns void, because it's what's going to be performed.

Alright, so we'll do that here.

So our block will be an escape... is it escaping?


No, it's not.

It's very much escaping.

Oh yeah, because you have to hold on to it until you tap on it.

It will escape SwiftUI and go into AppKit land.

Alright, block, block.

Okay, and then our body here.

Yeah, so for the body, let's stub it out and just return content for now, because we have

to actually write what's going to happen in there.
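
At this stage the stubbed-out modifier and its extension might look roughly like this. The type and function names follow what is said in the episode; the unlabeled `block` parameter is an assumption.

```swift
import SwiftUI
import AppKit

struct TapGestureWithSourceRectModifier: ViewModifier {
    // Stored for later use, which is why the closure must be escaping.
    let block: (NSRect) -> Void

    func body(content: Content) -> some View {
        // Stub: returns the content unchanged until the AppKit view exists.
        content
    }
}

extension View {
    func onTapWithSourceRect(_ block: @escaping (NSRect) -> Void) -> some View {
        modifier(TapGestureWithSourceRectModifier(block: block))
    }
}
```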

And as you can see, for now, the only thing we have is a SwiftUI View modifier, but I

told you we were going to do some AppKit.

So the modifier is just a way to hide the fact that we're doing AppKit behind the scenes,

which let's be honest, like it's what SwiftUI is doing in most places.

On macOS it's doing AppKit, and on iOS it's doing UIKit and things like

that. So we will need an NSViewRepresentable. So you will have to be creative with the names again.

Let's call it "Tap gesture with source rect view" maybe and that's an NSViewRepresentable.

Tap gesture with source rect view which will be NSViewRepresentable.

And not only that, but we'll have to create our own NSView that will actually do the work.

So it's a lot of boilerplate, unfortunately.

So this will be tap gesture with SourceRect.


NSView, got it.

And for those types of views, what you can do, and what I usually do, is I actually embed

them in the NSViewRepresentable and I underscore the name of the class. But it can't be private

because you'll have to declare it. And also the NSViewRepresentable must be a struct. It

can't be a class. Oh that's true. Okay so this one is going to return our tap gesture

right? Right, and you have to change it. And then our view type here, tap gesture, okay. So,

so we... Let's make it... We actually have to make that, so we'll create a custom initializer

for that, for that and make it receive the, the block, so we'll have to pass the block

along all the way to AppKit basically. And in more complex situations where

like I have a more complicated callback I will declare a type alias for it in

this case since it's a simple one and we know that this is what it's going to be

we don't have to create a type alias but imagine if we wanted to change the

signature it will be quite annoying we will have to change it every time.

I have found that when doing Combine subscriptions and NSViewRepresentables, we have the same kind

of problem where you're like, "Okay, you need this input.

Okay, now you need to just take that input and pass it here."

And then that input needs to be passed here because the other thing is when you have make

coordinator you often have to pass all of your input to the coordinator too.

It's just kind of an interesting set of behaviors that they share when you start customizing

stuff like this.

But the end result, I think, is worth it, because what we get is this, which is amazing.

And I have, I must confess that I have already taken this modifier that I wrote originally

for this series and used it in projects.

So this is one of those things that, when I need something like this, I'll often create,

and I think that's what I did when I was working in this implementation, is I have created

another project, then I implement it in a reusable fashion, and that's work that you

do once, and then it's done, and you can reuse it whenever you want, and when you actually

just use it, it looks like SwiftUI. You don't have to worry about the fact that it's doing

all of this under the hood. But yeah, it is a bit of work to get started.

So this one also needs the block parameter.

Right. One more time.

So when I pass it in, I do that one more time.

Okay, so we pass it along.

So now we have it in our custom view.

And I'll save it.

Right, because we don't want to use it yet, right?

Yeah, and you want to do super dot init, and you'll use the frame version with dot zero.

It doesn't really matter in this case.

And for the make NSView up there, you can remove the let.

You have to do the coder thing.

That's what I think when I'm implementing with coder.

Sorry, you were just saying I needed one more thing.

The let view you can remove, you can just return.

That's basically it.
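
Put together, the wrapper described so far could be sketched like this. The names follow the discussion (a nested, underscore-prefixed NSView subclass); marking `init(coder:)` as unavailable is one common way of handling "the coder thing" and is an assumption here.

```swift
import SwiftUI
import AppKit

struct TapGestureWithSourceRectView: NSViewRepresentable {
    let block: (NSRect) -> Void

    init(_ block: @escaping (NSRect) -> Void) {
        self.block = block
    }

    func makeNSView(context: Context) -> _TapGestureWithSourceRectNSView {
        // Pass the block along, all the way into AppKit.
        _TapGestureWithSourceRectNSView(block)
    }

    func updateNSView(_ nsView: _TapGestureWithSourceRectNSView, context: Context) {}

    // Nested so it stays out of the way; it cannot be private because it
    // appears in the signatures above.
    final class _TapGestureWithSourceRectNSView: NSView {
        let block: (NSRect) -> Void

        init(_ block: @escaping (NSRect) -> Void) {
            self.block = block
            // The initial frame does not matter; layout will size the view.
            super.init(frame: .zero)
        }

        @available(*, unavailable)
        required init?(coder: NSCoder) {
            fatalError("init(coder:) is not supported")
        }
    }
}
```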

The NSViewRepresentable is basically just a simple wrapper around the NSView. It doesn't really do

anything. So the actual implementation is in the NSView and what we want to do

for the NSView is override the... and this will depend on what behavior we want but

I chose to override mouse down because I want the event to happen when you press.

I don't want to have to wait for the click to come up and we are not going to support double clicking

So that's not really an issue in this case and you can call super in there just to be nice

Right, so

One more thing to override just so that we don't forget it and this is also depending on what behavior you want

But I also chose to override

accepts first mouse

for event

that one and just return true and

what that does is, you know when you have a window on macOS and it's not

Like in the foreground, but you can see it

There are some controls that you can click and some that you can't and that's what accepts first mouse does so

basically if the person has

the timeline in the window, but they are like in Safari, but they can see the timeline and they click one of those images it

Should open when they click without having to activate the app first

Okay. Yeah, otherwise the first click would activate, then the second click would do this, right? Okay. Got it.
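
The two overrides discussed here can be shown in isolation. The class name `ClickThroughTapView` and the `onMouseDown` hook are hypothetical, used only to keep the example self-contained.

```swift
import AppKit

final class ClickThroughTapView: NSView {
    // Hypothetical hook so the example has something to call.
    var onMouseDown: () -> Void = {}

    // Returning true lets a click act on this view even when the app's
    // window is visible but not active, instead of only activating the app.
    override func acceptsFirstMouse(for event: NSEvent?) -> Bool {
        true
    }

    // React on press rather than waiting for the mouse button to come up.
    override func mouseDown(with event: NSEvent) {
        super.mouseDown(with: event)
        onMouseDown()
    }
}
```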

Okay, so at this point now we have a mouse down the event has with it

the location in the window that you tapped and it's got absolute coordinates and is there anything else that's useful here?

We actually don't care about the event at all in this case.

And that's because

the place where the click happened doesn't matter because we are

the rectangle represented by this view

after we add it to the modifier later will be the rectangle of

our image in the gallery. So it doesn't matter if the click happened at the

bottom left or top right, it just matters that they clicked the overall rectangle

represented by this view. So all we need to do is to take the view's frame and

convert it to window coordinates. And then we have to take the window

coordinates and convert to screen coordinates.

Okay, so views, frame into window coordinates, then window into screen.

Okay, so we have convert, rect, which is frame to window, right?

And I think window is optional, isn't it?


So let's do some unwrapping, yeah, for the window.

And to convert the frame to the window, you have to set the "to" parameter to nil.

That's how you tell it to do the window conversion.

But we'll have to do one more step, actually.

And that's because we are inside the scroll view.

If you remember our timeline, we will be inside a scroll view.

And scroll views are weird.

So they do some bounds manipulation and there's the clip view, which is a whole other thing

that we'll probably learn about soon.

So we'll actually have to ask our superview to convert our frame to the window.

So it's a bit…

So instead of asking ourselves, we need to ask superview.

So should we do some sort of like guard... let's see... scroll view... ooh.

Yes, that's a thing.

But we don't have to do that because this is not going to hurt if there is no scroll

view involved.

And actually...

Okay, so if there's... but we do want to make sure there's a super view, right?

Right, yeah.

It is just super view.

Auto complete is being done.

And the V is not capitalized.

Oh, okay.

So this warrants a comment.

So we got our frame within our window.

So this is, if we wanted to add like an overlay, we wouldn't do that if we're using SwiftUI.

But just as an example, if you wanted to add an overlay view to the exact place above our

view in the window and just like add it to the windows content view without actually

adding it to the view hierarchy, this would be the frame we would be using because that's

exactly where our rectangle is within the window.

But we are going to be using this to position another window and the window doesn't work

in window coordinates.

It works in screen coordinates.

And just to back up one second, in case it's not clear, what we are doing with this is

we're actually overlaying an AppKit view to be exactly the same size as each one of these



Right, so...

We aren't overlaying yet, by the way.

We will be.


So, that's the piece that I think that is maybe missing from the discussion is that

when you're doing mouse down and this frame that we're converting, this frame just happens

to be or will happen to be the same exact size as the image because we will overlay

it and it will expand to fill.


Exact same size and it will be in the right place within the view hierarchy so that we

can get the correct frame in window coordinates.


So converting to screen coordinates, I absolutely have done this before

with my app side mirror, but I can't remember how I did that.

It's quite simple, so we'll just ask the window, which we've already unwrapped,

to convert to screen and we'll give it the window frame.

Screen frame.

And that's what we pass to the onTap or the block that we created.
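
The two-step conversion can be sketched as a small helper. The function name `screenRect(of:)` is hypothetical; in the episode the same steps live inside `mouseDown`, and the result is passed to the block.

```swift
import AppKit

/// Returns the view's rectangle in screen coordinates, or nil if the view
/// is not currently installed in a window.
func screenRect(of view: NSView) -> NSRect? {
    guard let window = view.window, let superview = view.superview else {
        return nil
    }
    // A view's frame is expressed in its superview's coordinate system, so
    // the superview does the conversion. Passing nil as the destination
    // converts to window coordinates.
    let windowFrame = superview.convert(view.frame, to: nil)
    // Window coordinates to screen coordinates.
    return window.convertToScreen(windowFrame)
}
```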

Okay, so here, are we overlaying this now?


So, this will be our tap gesture source rect view with block.

Okay, amazing.

And then I have now a screen frame here.

And let's just print it out and see if this works.

Okay, and check it out, I'm getting a screen frame.

These numbers are... well, it's gonna be the top, the lower left corner of each view, right?

Yeah, and it's the reason why X is so large is probably because you are in the display

that's to the right.

I have three monitors.

And you're in the middle one.

I'm in the middle one right now, yeah.

And also the display order sometimes gets wonky, and so even though this one is positioned

in the middle, sometimes it thinks it's display number three.

For some reason I don't understand.

But I don't know if we talked about this during the series, but the way screen coordinates

works on macOS is basically when you're using the extended display, you have a huge canvas

that encompasses all of your displays.
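
One way to see this shared canvas is to print the frame of every attached screen. This is just an illustrative snippet, not something done in the episode; the values printed depend entirely on the display arrangement.

```swift
import AppKit

// Every screen's frame is expressed in the same global coordinate space.
// The primary screen (index 0, the one with the menu bar) has its origin
// at (0, 0); the other screens are positioned relative to it, so their
// origins can be large or negative.
for (index, screen) in NSScreen.screens.enumerated() {
    print("Screen \(index): frame = \(screen.frame)")
}
```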

So that's why you see that the X value is so large, it's because you have a display

to the left, and it's probably like a fairly high-res display, so that's taking into account

the whole width of that display, and then... and you might have displays where like the

Y value is negative, for example, if you have it at the top or bottom, so...

Because zero for the main screen, this is probably worth going to the displays pref

pane to show it. If you go to here and say Arrange, so this is my MacBook and it actually

is slightly lower because it's in my physical space it kind of looks like this. And I want

to be able to drag a window and have it line up, like the edges of the window line up exactly

when I do that. So when I go to the bottom of this one it isn't at zero. It's at 200

or whatever that is. And then conversely, if I were to drag it...

oh goodness, probably not a good idea. Yeah, while recording, we'll see if the

screen recording software doesn't get confused.

So now this stuff would be negative at the bottom because this is where the

zero line would be because of the main

display. Right. All right, I'm gonna try to

restore it where it was. Okay, so now that we sort of have a good understanding of this overlay and

what this rect means in terms of all the monitors, we now need to open this window

starting from that frame, right? Right. So how did you do that? Did you say starting frame or...

I used SourceRect as the label. I think that's what Apple's API tends to use when there's

this concept of a source frame. So I just use SourceRect.

I like that name because it's not really the frame of your whole screen.


All right, so let's go to gallery window manager, which is here, and we'll give it

"source rect is a NSRect".

Okay, so now we can set the window controller's frame.

windowcontroller.window.setframe. Let's try that. Okay, source rect. Does this mean

update the contents and draw it again? Yeah, that basically means it's going to do set needs display

on everything. I don't know to which extent that still matters these days, but since we are doing

this before we show the window, I think it does maybe, and we have to take out the center at the

bottom there because otherwise it will just re-center.


Okay, so if we run it just like this, then we'll see, or we should see, it pop up not

quite in the exact spot because we have a title bar, and the size of the image is resizing

our window, right?

That's true.

we'll have to do some work to make this work the way we want. So the first thing

should I subtract the title bar from this or should we do something with

aligning the windows content rect? So you don't have to do that manually we'll do

that now but there are facilities to do that but the first thing I want to do is

actually change the window style because if you remember the original demo we

gave it wasn't using the traditional Mac window style but I am thinking since

we're using a window controller should we maybe just pass the source rect along

and do all of the configuration in the window controller instead yeah that's a

good idea so that we don't keep messing with the window yeah that's kind of the

point of the window controller right is to give it the control of the

responsibility. Yeah, okay.

Okay, so we've got now

window.stylemask and window.setframe.

Right, so I will

propose we kinda start over

with the creation of the window because we will have to change

quite a few things.

let's first instead of using an NSWindow we'll use something a bit different which is an NSPanel

and an NSPanel is an NSWindow so it inherits from NSWindow but there are different behaviors for an

NSPanel it's basically it's kind of an old thing because it was made for those floating inspectors

which were a thing back in, I guess, since the NeXTSTEP days and early Mac OS X and later OS X.

These days they aren't that used anymore but you still see them every now and then and something

like QuickLook. So those types of things are usually an NSPanel. The reason you use an

NSPanel is usually because it's like an auxiliary window. It's not like an actual

main window. And QuickLook is actually a good comparison, because it's kind of what we're

doing. We're basically re-implementing QuickLook. Yeah, you tap on the thing, it expands to

show it to you, and maybe if you hit escape it goes away. Yeah, and there is a QuickLook

API you can use for getting this sort of UI in your app, but it's not very nice, not even

in AppKit, and to actually get it to work with the SwiftUI stuff, I don't think it's worth it.


I think it's actually easier and nicer to create our own implementation.

Right, so we'll start out with the content rect of the NSPanel being our source rect.

So you can use that initializer.

By the way, there is a very important thing to note here.

If you are following along and you get the autocomplete, you might be wondering if you

don't get the same result we'll get.

There is a version of this initializer that takes a screen as an input, and I think we

mentioned it.

And that's not the one we want in this case, because the source rect we have is in that

huge canvas where all of the displays are.

And if you use the version of the initializer that takes a screen, it will consider the

source rect only within the world of that one screen you're using.

So in this case, we do not want that.

Yeah, that makes sense.


And then style mask.

What are we going to do for this?

So StyleMask, let's do HUD window, which is the dark rounded window with a little tiny


It's also going to be a utility window.

Oh, okay.

Yeah, so you see that utility window, the documentation is, it's an NSPanel.

So basically you have to use Utility Window in an NSPanel.

It's also going to be closable, of course, we want people to be able to close it.

And it's also going to be resizable, because we want people to be able to resize it.

And it's also going to be titled, because it will have a little title area.
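
The panel creation described above might look like this. The function name `makeGalleryPanel` and the `backing` and `defer` values are assumptions; the important part is using the initializer without a `screen` parameter, so the source rect is interpreted in global screen coordinates.

```swift
import AppKit

func makeGalleryPanel(sourceRect: NSRect) -> NSPanel {
    NSPanel(
        // Content rect, not frame: the title bar is added around it.
        contentRect: sourceRect,
        styleMask: [.hudWindow, .utilityWindow, .closable, .resizable, .titled],
        backing: .buffered,   // assumed; not mentioned in the episode
        defer: false          // assumed; not mentioned in the episode
    )
}
```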

Okay, so now we have a panel, it's got its style mask, we are setting its content view,

we have to call super, right?

self because we're in a convenience. Okay and we're passing in the panel as a

window, right? Yeah. We will also have to change the root view of the NSHostingView:

we want to give it a flexible frame, but we want it to start out with the

size that it came with from the source rect, so we'll put a frame modifier

in the root view.

So this will be the min max one.

- Min max.

Min max, right?

- Right.

- So min is source rect dot min?

- Source rect dot width, yeah.

- Dot width, rect.

Dot width, max is infinity.

- Right.

And I didn't play around with this,

but I think we could probably like actually use

a smaller min width and min height

and set the ideal size to be the one from the source rect.

So that if you want to make it smaller after you open it,

you can, but I think it doesn't really make sense

because if you are clicking, you want it to be larger.

So I decided to do it this way.

- Yeah, that makes sense.
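
A sketch of that flexible frame on the root view, using a hypothetical `GalleryRootExample` view with a plain color standing in for the real gallery content:

```swift
import SwiftUI

struct GalleryRootExample: View {
    let sourceRect: CGRect

    var body: some View {
        Color.black // placeholder for the real gallery content
            .frame(
                // Never smaller than the thumbnail we expanded from...
                minWidth: sourceRect.width,
                maxWidth: .infinity,
                minHeight: sourceRect.height,
                // ...but free to grow as the window is resized.
                maxHeight: .infinity
            )
    }
}
```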

OK, is this enough to run?

We can try that.

Now we pass source rect along.

Hey, check it out.

It is super close to being the same height.

We're off by this title bar difference.


Oh, wait, no, we are not.

Actually, no.

I'm going to zoom in.

And this is exactly in the right spot.

Oh, that's incredible.

You can tell because the dog image doesn't change at all when I do this.

Okay, that's amazing.

Love it.

So, the reason it's working now, and I have to give you a concept, yet another one in

this video.

I think this is pretty dense already, but when you initialize the NSPanel, notice that

the parameter label is contentRect.

Oh, yeah, yeah.

It's not frame, and there is a difference.

So the NSWindow has a contentRect, so that's the rectangle of the content area of the window

in screen coordinates, which is exactly what we have.

So we have the source rect.

But there's the window frame and the window frame is the whole thing.

So the window including decoration, so including borders, title bars and that sort of thing.

So that's why when you did set frame, it wasn't really working because it wasn't accounting

for the title bar.

And there is a method on NSWindow and of course NSPanel that is frameRect(forContentRect:).

And there's actually a static version which is frameRect(forContentRect:styleMask:).

So if you need for some reason to get the frame that you have with a given content rect,

with a given style mask, even before you create a window.

Yeah, without having to create it first.

So it's nice that they give you those facilities to calculate those metrics.

And we will actually use that because in order to animate exactly the way we want, we will

need a new content rect.

So we will have, when we do the actual setting the frame with an animation, we will have

to convert to a frame rect.
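
The difference between content rect and frame can be checked with the class-level helpers on NSWindow, without creating a window first. The style mask and rectangle below are arbitrary example values.

```swift
import AppKit

let style: NSWindow.StyleMask = [.titled, .closable, .resizable]
let contentRect = NSRect(x: 100, y: 100, width: 400, height: 300)

// The frame that a window with this style needs in order to show the
// given content rect (content area plus title bar and any borders).
let frameRect = NSWindow.frameRect(forContentRect: contentRect, styleMask: style)

// And the inverse: the content area inside a given frame.
let contentAgain = NSWindow.contentRect(forFrameRect: frameRect, styleMask: style)

print("content:", contentRect, "frame:", frameRect, "round trip:", contentAgain)
```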


So at this point we have no animation, right?

And no growing even.

No growing.

So this is still going to be in AppKit land where we animate, or can we do this in SwiftUI


It's all gonna be AppKit.

Because I feel like we are pretty close to saying, like, initialize with an initial frame

and a target frame and then have the animation be in SwiftUI.

I don't know if that's going to work with AppKit with the panel.

You know what?

I didn't try that, but that's definitely something we could try.

I'm not sure to which extent AppKit will be happy to animate the window frame based on

something that's happening in SwiftUI, because the root view does define the constraints

for the window, because you won't be able to resize the window to be smaller than the

the min width and min height in this example.

So it is defining the frame of the window to some extent,

but I don't know to which extent it will respect that

if it changes and with the SwiftUI animation,

because view animations and window animations

are kinda different in AppKit.

And the way I did it originally was with window animations,

but we could definitely try that out.

That would be, if that works, I will be quite happy actually.

Okay, well we'll stick with the solution that you are familiar with.

What is the next step then?

So let's actually override showWindow for the window controller, which is what we are

calling to display the window.

And in this case, we don't want to call super because we will be customizing everything,

basically. So when we call showWindow, instead of just popping the window on screen

we want to pop it on screen but then immediately start an animation to grow the the size of its

frame. So the first thing we need is to actually know which frame we want it to be and in this case

I was thinking about different ways to achieve this and my solution, which could be like maybe not ideal

and probably breaks in some situations, but was to just inset the original rectangle by a given amount.

And in this case we will use negative insets to grow the rectangle instead of shrinking it.

Yeah, that makes sense. I think that it's not gonna match the image exactly, but I think that's fine.

Because we can still resize the window. Also, we don't want this window to match only the one image

you tapped on, because then if you hit the next arrow to go to the next image, then it, you know,

what if you had one wide and one tall? You want something that's gonna accommodate sort of a

general image right? Right. So the first thing we want to do is unwrap the window

property. So just do a guard. It's annoying that it's optional but this is

from the nib, xib, and then storyboard days. So it could in theory be optional

but it's not in this case. So we have the frame, we set the frame for the panel

at the initializer. So now we will actually just use the frame property of the window as the... it's

basically gonna be the source rect but accounting for the title bar and in this case we do want it

to be like that. So you can inset and the value I used was negative the width so basically make it

twice as big. Okay. And that will be frame.width. Negative window.frame.width. Yep. And dy would

be negative frame.height. Yeah. So that's basically making it twice as large. However.

Is this not making it...

Oh, no, no, okay, that does account for the total delta.
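
The arithmetic of a negative inset can be worked through with example numbers (the rectangle is made up). Because `insetBy` moves every edge, insetting by the negative width and height adds the original size on each side: the result is three times as wide and tall, and it stays centered on the original.

```swift
import Foundation

let sourceFrame = CGRect(x: 500, y: 400, width: 200, height: 100)

// Negative insets grow the rectangle outward on all four edges.
let targetFrame = sourceFrame.insetBy(dx: -sourceFrame.width, dy: -sourceFrame.height)

// x: 500 - 200 = 300, width:  200 + 2 * 200 = 600
// y: 400 - 100 = 300, height: 100 + 2 * 100 = 300
print(targetFrame)
```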




And actually, in this case, we will not have to do any conversion, because we took the

frame from the window already.

What I'm not entirely sure, and this is because we are doing the implementation a little bit

differently than my original one. So there's this thing called the run loop

which you're probably familiar with and I think we're not gonna have any issues

in here but I'm wondering if window.frame will be what we expect at this

point when we call showWindow and we will find out soon enough. So let's just

try that. So should I just do set frame or sorry window dot frame just for now

just so we can see it? Let's do that. You actually have to call set frame, you can't

set it directly. Okay.

Set frame frame rect animate? Oh that's handy right?

sure animate true however we're not showing the window how do you show the

window I always do it on a window controller make key and order front

that's the that's right make key and order front and this is self or sender

just self so it doesn't really know yeah okay so we we show the window and then

we animate to the larger frame I don't know if this is gonna work but holy moly

it worked that's pretty amazing now it doesn't show the image while it's

loading also we should probably have a max. Yeah that's gonna get unwieldy

but yeah so it doesn't show the contents while it's animating but this is

actually pretty cool and it's not a lot of code once you have gone through the

explanation and you know why each one of these things is necessary.
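
Pulling the steps together, the override could be sketched like this. The class name `GalleryWindowController` is hypothetical, and the real controller also sets up the panel and its content, which is omitted here.

```swift
import AppKit

final class GalleryWindowController: NSWindowController {
    // Deliberately does not call super: showing and animating are handled here.
    override func showWindow(_ sender: Any?) {
        guard let window = window else { return }

        // Grow outward from the frame the panel was created with.
        let targetFrame = window.frame.insetBy(
            dx: -window.frame.width,
            dy: -window.frame.height
        )

        // Show the window at its initial frame first...
        window.makeKeyAndOrderFront(self)
        // ...then animate to the larger frame. The frame property itself is
        // read-only, so the change has to go through setFrame.
        window.setFrame(targetFrame, display: true, animate: true)
    }
}
```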

And again this is something that could be fairly reusable. In this case we're

not making it very reusable because we are using the media attachment type but if

we wanted to make a more generalized version it wouldn't be that hard or if

you need it in another project to just copy it over. We will, as you mentioned, we

don't see the image as it animates of course because the images are being

loaded from a server and you do see the little spinner and then the image

appears so we will have to do some work and probably do some changes to how our

remote image works so that we can get a snapshot basically of the not really a

snapshot but basically give me the image you already have so that I can display

it while animating because we are seeing the image already so we we have it

somewhere and we'll have to do that work as well in order to get the perfect transition.

I did exactly that technique in the NSScreencast app for iPad, the iPad layout. When you tap on it,

it would expand the image and I eventually got rid of that code for reasons that I don't remember

now because this app has gone through many transitions in the decade it's been around.

But yeah, I did do something like that where I have like,

okay, give me the thumbnail that you have now

and I'll download the new one while you're transitioning.

- Right.

- And it looks okay in most cases.

Sometimes you end up blowing up a really ugly thumbnail

and you have to wait, but maybe blur would be,

thumbnail plus blur might actually look a little bit better

than stretching the pixels.

I don't know.

- Yeah, and there are other techniques like BlurHash,

which I do think we do get from Mastodon's API, so if you want to do that, you could

do that as well.

So there are other ways you can approach it.

But usually if you have a decent internet connection, by the time the transition finishes,

it will have the large image, so you won't see the blurry version for too long.


Okay, I think this is a great stopping point.

we have an animated window.

And we dove into AppKit, which is pretty fun.

- Nice.

- All right, we will see you in the next one.